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Description 

[0001] The present invention relates generally to database management systems, and particularly to a database 
management system in which database table availability is maintained, with minimal or no user availability outages 
5 during table restructuring operations such as splitting a table or index partition, moving an existing table or index 
partition, creating a new index and moving a table or index partition boundary. 

BACKGROUND OF THE INVENTION 

10 [0002] Database configuration and reconfiguration operations can have a significant effect on the availability of user 
applications that need access to databases undergoing structural changes. The Tandem™ NonStop™ SQUMP rela- 
tional database management system (DBMS), priorto the present invention, allowed read access, but not write access, 
to the portions of the database table undergoing the restructuring operation. 

[0003] Although most users perform these operations infrequently, their duration can account for thousands of min- 
ts utes of application outages per year. A discussion of the cost of application outages appears in the article "An Overview 
of NonStop SQUMP," Ho et al., Tandem Systems Review, July 1994. 

[0004] Embodiments of the present invention eliminate most of the downtime associated with four database recon- 
figuration operations: Move Partition, Split Partition, Move Partition Boundary, and Create Index. The Move Partition 
procedure moves a partition that resides on one disk to another disk. The Split Partition procedure splits one partition 
20 jnto two. 

The Move Partition Boundary procedure moves rows of a base table between adjacent partitions, typically shifting rows 
from a large partition to a less large partition. The Create Index procedure creates an efficient alternate access path 
to a database table by ordering data according to the value specified in the key columns. The Create Unique Index 
variant of this procedure ensures that there is exactly one alternate access path to each record in the database table. 
25 [0005] The implementation of these procedures in embodiments of the invention substantially reduces, but does not 
eliminate, associated outages. Even with the present invention, user database activity continues to be restricted for 
about one minute or less per database restructuring operation. The outage time varies depending on the number of 
user transactions running against the table being restructured, the size of those transactions, and the number of par- 
titions in the affected table. 

30 [0006] Embodiments of the present invention enable substantially improved user transaction access to a database 
table while the table is undergoing a structural change operation. 

[0007] Embodiments of the present invention also enable Move Partition, Split Partition, Move Partition Boundary, 
and Create Index operations to be performed on a database table while enabling user transactions to continue to be 
performed on the table, except during a short final phase of those operations. 
35 [0008] Embodiments of the present invention also avoid the use of "side files" for the above mentioned database 
reconfiguration operations, so as to reduce the overhead disk space requirements associated with these operations 
to almost zero. 

[0009] Embodiments of the present invention also implement the above mentioned database reconfiguration oper- 
ations using procedures that directly read from the transaction audit log, thereby making these database reconfiguration 

40 operations similar to a database recovery process. 

[0010] Embodiments of the present invention also implement the database reconfiguration operations using well 
established, optimized, database recovery process procedures thereby making the database reconfiguration opera- 
tions efficient in terms of computation time and in terms of the computer resources used. Embodiments of the present 
invention also utilize substantially similar database reconfiguration procedures for four distinct database reconfiguration 

45 operations, thereby improving the reliability of those procedures and simplifying maintenance of and updates to those 
procedures. 

SUMMARY OF THE INVENTION 

so [001 1] Accordingly, the present invention provides a computer system for storing and providing user access to data 
in stored database objects, comprising:memory for storing said database objects, said database objects comprising 
database tables, database partitions and database indexes, wherein said memory resides in a plurality of intercon- 
nected computer nodes; a central processing unit; a transaction manager, coupled to said memory, for managing 
computational transactions that add, delete and alter data stored in said database objects; said transaction manager 

55 including audit trail generation instructions for generating an audit trail and storing said audit trail in said memory, said 
audit trail including audit records, at least a subset of said audit records each denoting an event selected from the set 
consisting essentially of addition, deletion and alteration of specified data in a specified one of said database objects; 
a restructuring procedure, executable by said cental processing unit, for restructuring a specified one of said database 
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objects; 

characterized in that said restructuring procedure includes: 

first phase instructions for accessing data in a first of said objects and storing corresponding data in a second of 
said objects while allowing continued performance of computational transactions against said first object; second phase 

5 instructions, for execution after said first phase instructions, for accessing said audit records in said audit trail created 
by said transaction manager during execution of said first phase instructions and thereafter until execution of said 
second phase instructions is completed; said second phase instructions updating said data stored in said second object 
by redoing with respect to said second object each event denoted by said accessed audit trail records; and third phase 
instructions, for execution after said second phase instructions, for obtaining a lock on said first object so as to prevent 

10 continued performance of computational transactions against said first object, and for then accessing audit records in 
said audit trail created by said transaction manager after execution of said second phase instructions; said third phase 
instructions updating said data in said second object by redoing with respect to said second object each event denoted 
by said accessed audit trail records. 

[0012] A preferred embodiment of the invention comprises a set of procedures for modifying the structure of a da- 
15 tabase table or index "online," while the database table or index remains available for execution of transactions, with 
minimal impact on the availability of the database table for transaction execution. 

[0013] The preferred embodiment operates in a database computer system having memory, residing in a plurality 
of interconnected computer nodes, for storing database tables. Each database table or index has a plurality of columns, 
a primary key index based on a specified subset of the columns, and an associated tabie schema. In most implemen- 
20 tations, at least some of the database tables or indexes are partitioned into a plurality of partitions, each partition storing 
records having primary key values in a primary key range distinct from the other partitions. 

[0014] A transaction manager generates and stores an audit trail, each audit entry denoting a database table or 
index record event, such as an addition, deletion or alteration of a specified database table or index record in a specified 
one of the database tables or indexes. 

25 [0015] Four online data definition procedures allow the structure of a database table or index to be altered while the 
database table or index remains available for execution of transactions, with minimal impact of the availability of the 
database table or index for transaction execution. The four online data definition procedures are a Move Partition 
procedure, a Split Partition procedure, a Move Partition Boundary procedure, and a Create Index procedure. Each of 
these online procedures has several phases of execution. In a first phase, the definitions of existing objects are read 

30 and any new objects that are needed to perform the requested procedure are created. Then, records of a table or index 
partition or the entire table are accessed, using "browse" access, so as to generate a new partition, to move records 
between two partitions, or to create a new index. Browse access is a form of read access that allows database records 
to be read through any record locks that may be in place. Thus, browse access will sometimes result in the reading of 
records that are in the midst of being modified. 

35 [0016] In a second phase, audit trail entries are accessed and the equivalent database table or index operations 
denoted in those audit trail entries are redone on the target objects, whenever necessary, to bring the data records 
created during the first phase up-to-date. In a third phase, access to the database table is briefly locked (A) while audit 
trail entries created after the second phase are used to make final changes to the previously created data records, 
and then (B) while the database table or index schema is updated to reflect the changes to the database table or index 

^o produced. 

[0017] In a fourth phase, used by the Move Partition Boundary and Split Partition procedures, records in a database 
partition that are inconsistent with the modified database table schema are deleted as a background operation while 
use of the database table by transactions resumes. 

45 BRIEF DESCRIPTION OF THE DRAWINGS 

[0018] Embodiments of the invention will now be described, by way of example only, with reference to the accom- 
panying drawings, of which: 

50 Figure 1 is a block diagram of a computer system having a database management system in accordance with the 

present invention. 

Figure 2A is a block diagram of a database table. Figure 2B is a block diagram of an alternate index. Figure 2C is 
a block diagram of the data structure of an Audit Trail. 

55 

Figure 3 is a conceptual diagram of the three primary phases of the database restructuring procedures in the 
present invention. 
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Figure 4 is a flowchart of the Move Partition procedure used in a preferred embodiment of the present invention. 

Figure 5 is a flowchart of the Split Partition procedure used in a preferred embodiment of the present invention. 

5 Figure 6 is a flowchart of the Move Partition Boundary procedure used in a preferred embodiment of the present 

invention. 

Figure 7 is a flowchart of the Create Index procedure used in a preferred embodiment of the present invention. 

10 DESCRIPTION OF THE PREFERRED EMBODIMENTS 

[001 9] Referring to Figure 1 , there is shown a computer system 1 00 for storing and providing user access to data in 
stored databases. The system 100 is a distributed computer system having multiple computers 102, 104, 106 inter- 
connected by local area and wide area network communication media 1 08. The system 1 00 generally includes at least 

15 one database server 1 02 and many user workstation computers or terminals 1 04, 1 06. 

[0020] When very large databases are stored in a system, the database tables will be partitioned, and different 
partitions of the database tables will often be stored in different database servers. However, from the viewpoint of user 
workstation computers 1 04, 1 06, the database server 1 02 appears to be a single entity. The partitioning of databases 
and the use of multiple database servers is well known to those skilled in the art. 

20 [0021] As shown in Figure 1 , the database server 102 includes a central processing unit (CPU) 110, primary memory 
112, a communications interface 114 for communicating with user workstations 104, 106 as well as other system re- 
sources not relevant here. Secondary memory 116-1, 116-2, typically magnetic disc storage, in the database server 
1 02 stores database tables 1 20, database indices 1 22, a database management system (DBMS) 1 24 for management 
of the database tables and associated data structures and resources, and one or more catalogs 1 26 for storing schema 

25 information about the database tables 120 as well directory information for programs used to access the database 
tables. The DBMS 124 includes an SQL executor 128 for executing SQL statements (i.e., database queries) and an 
SQL catalog manager 1 30 for maintenance of the catalogs 1 26 and for performing database definition and restructuring 
operations. The SQL catalog manager includes the "online DDL procedures" 132 of the present invention for restruc- 
turing database tables while providing improved user transaction access to the affected tables. 

30 [0022] The database server 102 further includes a transaction manager 1 34 for managing transactions, and appli- 
cation programs 136 that are utilized by users to perform transactions that utilize the database tables 120. The trans- 
action manager creates audit entries for each transaction, which are durably stored in an audit trail file 138 in secondary 
memory. 

[0023] End user workstations 1 04, 106, typically include a central processing unit (CPU) 140, primary memory 142, 
35 a communications interface 144 for communicating with the database server 102 and other system resources, sec- 
ondary memory 1 46, and a user interface 1 48. The user interface 1 48 typically includes a keyboard and display device, 
and may include additional resources such as a pointing device and printer. Secondary memory 146 is used for storing 
computer programs, such as communications software used to access the database server 1 02. Some end user work- 
stations 1 06 may be "dumb" terminals that do not include any secondary memory 1 46, and thus execute only software 
40 downloaded into primary memory 142 from a server computer, such as the database server 1 02 or a file server (not 
shown). 

Glossary 

45 [0024] To assist the reader, the following glossary of terms used in this document is provided. 

[0025] SQL: SQL stands for "Structured Query Language." Most commercial database servers utilize SQL. Any 
program for accessing data in a database that utilizes SQL is herein called an "SQL Program." Each statement in an 
SQL program used to access data in a database is called an "SQL statement." 

[0026] Object(s): An object is a file, database table or other encapsulated computer resource accessed by a program 
50 as a unitary structure. In the context of the preferred embodiment, objects are database tables or indexes. In other 
implementations of the present invention, objects may be other encapsulated computer resources that the end user 
accesses indirectly through validated methods (i.e., programs) designed specifically to access those computer resourc- 
es. 

[0027] DDL Statement: a data definition language statement. DDL statements are used to create and modify data- 
55 base tables. 

[0028] End user: a person using a workstation to access database information in a database server. End users 
typically do not have the authority to modify the structure of database tables. 

[0029] Operator: a person using a workstation who has the authority and access rights to modify the structure of 
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database tables and to manually initiate compilation of SQL source code programs. 
Audit Trial, Database Table and Index Data Structures 

5 [0030] Figure 2A shows the data structure of a typical database table 120-1 . The table 120-1 includes a file label 
1 60, which is essentially a compactly stored copy of the catalog information for the database table, which represents 
the table's schema as well as other information not relevant here. Next, the table includes a primary key B-tree 162. 
The use of B-trees in database files is well known to those of ordinary skill in the art of database management systems. 
Next, the table has a data array 1 63 organized in rows and columns. The rows of the table are often called "records" 1 64. 

w [0031] In the context of preferred embodiments of the present invention, every database table has a primary index. 
The value of the primary index for a particular record is called the primary key, and the primary key is typically equal 
to either (A) the value in one field (i.e., column), (B) the concatenation of the values in a plurality of columns, or (C) a 
computed function of the values in one or more columns. 

The set of columns used to generate the primary key are represented by a vector herein called the PrimaryKeyColumns 
15 vector. There is a fl Create_PrimaryKey w function for every database table, represented as follows: 

PrimaryKey 

20 = Create_PrimaryKey(BaseTable(RecPtr), PrimaryKeyColumns) 

where RecPtr is a pointer to a database table record. 

[0032] It is often the case that an application program needs to access a database table in accordance with a set of 
column values, at least some of which are not included in the primary index. When that is the case, a Create Index 
25 procedure can be used to create an efficient alternate access path to the database table by ordering data according 
to the values in any specified set of columns. That ordering is represented by an "Alternate Index," which is typically 
implemented as a separate data structure from the associated database table. 

[0033] Figure 2B shows the data structure of an alternate index 170. The alternate index 170 includes a file label 
1 72, which includes a compactly stored copy of the catalog information for the index. The alternate index also includes 
30 an alternate key B-tree 1 74 and then a data array 1 76 organized in rows and columns. The data array has two sets of 
columns herein called the AltKey columns and the PrimaryKey columns. 

[0034] The rows of the data array 176 are called records 178, and each row of the Alternate Index corresponds to 
one record of the associated database table. Furthermore, each row of the Alternate Index has two fields: one repre- 
sents the alternate key value for the corresponding database table record, and one represents the Primary Key value 
35 for the same database table record. 

[0035] The set of columns used to generate the alternate key for a particular database table are represented by a 
vector herein called the Alternate Key Columns vector. There is a "Create_AltKey" function for every alternate index of 
any database table, represented as follows: 

40 

AltKey = Create_AltKey(BaseTable(RecPtr), PrimaryKeyColumns) 
where RecPtr is a pointer to a database table record. 

[0036] For the purposes of the Move Partition, Split Partition and Move Partition Boundary operations, a database 
45 jndex may be viewed in the same way as a database table. 

[0037] Figure 2C shows the data structure of an Audit Trail 1 38-1 . The Audit Trail includes a file label 1 82 and a set 
of sequentially generated and stored audit entries 1 84. Each audit entry 1 84 denotes a database table or index record 
event such as the addition, deletion or alteration of a specified database table or index record in a specified database 
table or index. 

50 

Database Table Alteration Procedures 

[0038] Figure 3 is a conceptual representation of the procedure for modifying a database table or index. The com- 
mands for making database table or index alterations are called data definition language (DDL) statements. In the 
55 preferred embodiment, the DDL statements used are not changed, except for the use of an "ONLINE" option in the 
DDL statements to indicate that the DDL operation is to be performed while minimizing the impact of the DDL operation 
on usertransactions. When the ONLINE option is specified, the preferred embodiment of the present invention changes 
how the SQL catalog manager 130 executes the Move Partition, Split Partition, Move Partition Boundary and Create 
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Index commands. 

[0039] The procedure for modifying a database table's structure begins with a user or operator entering a DDL state- 
ment, specifying an alteration in the schema of a specified object 200, which is typically either a database table, an 
index, or a partition of a database table or index. I he specified object is accessed to read its file iabei, and if xhe 
5 command requires generating a new object 202, the new object is created. If the command involves movement of data 
between the first object 200 and a second object 202 that already exists, the file label of the second object is also 
accessed. 

[0040] In the first phase of execution of the "online DDL command," a "dirty copy" of data from the first object 200 
into the second object 202 is made while user transactions against the first object (and against the second object if it 
10 existed prior to the command) are allowed to continue unimpeded by the execution of the DDL command. In particular, 
a "dirty copy" is made by accessing all the data records of the first object that are the subject of the DDL command, 
using "browse" access, and generating corresponding records in the second object 202 until the last of the relevant 
data records in the first object 200 have been accessed. 

[0041] Browse access is a form of read access that allows database records to be read through any record locks 
is that may be in place. Thus, browse access will sometimes result in the reading of records which are in the midst of 
being modified. Also, user transactions against the first object may delete or modify records in the first object while or 
after they are accessed by the first phase of the DDL command execution. As a result, the records created in the 
second object 202 may require a certain amount of correction. 

[0042] In the second phase of execution of the online DDL command, a partial clean-up of the records in the second 
20 object is performed by accessing records in the audit trail 204 associated with the first object 200 and performing 
corresponding "redo" operations against the second object. During this second phase user transactions against the 
first object (and against the second object if it existed prior to the command) are allowed to continue unimpeded by 
the execution of the DDL command. 

[0043] More specifically, before the dirty copy in the first phase of the online DDL command is executed, a pointer 

25 called AuditPtr is set to the end of the audit trail 204. During the second phase, each audit record beginning with the 
one referenced by the AuditPtr is inspected. If the audit record is relevant to operations performed on the first object 
200, or relevant to the subset of records of the first object that are the subject of the DDL command, then a redo 
operation is performed against the second object 202 based on the information in the audit record. In the case of a 
Create Index command, the redo operation uses new audit records that are generated based on the audit entries found 

30 in the audit trail because the second object has a very different structure than the first object; in the case of the other 
three online DDL commands, the redo operation uses the audit record as found in the audit trail except that the audit 
record is modified prior to the redo operation so as to reference the second object instead of the first object, and the 
redo operation is performed against the second object. 
[0044] The second phase continues until the end of the audit trail is reached. 

35 [0045] In the third phase, a lock is requested against the first object and second object (if it existed prior to the DDL 
command), and when that lock (or those locks) is (are) granted all user transactions other than browse access trans- 
actions are blocked until the third phase is completed. During the third phase, any new audit records in the audit trail 
204 associated with the first object 200 are accessed and corresponding "redo" operations are performed against the 
second object 202. Next, if the first object is a portion of a larger database table or index, a lock against the entire 

40 associated database table associated with the first object is obtained while catalog and file labels are updated. In 
particular, the catalog entries and the file labels associated with the first and second objects are updated to reflect the 
results of the DDL operation. Furthermore, the catalog entries and file labels of all objects which incorporate schema 
information made invalid by the DDL operation are also updated. Then the lock on the first and second objects (and 
the lock, if any, on the associated database table or index) is released, enabling user transactions against the first and 

45 second objects to resume. 

[0046] Finally, if the DDL command requires deletion of the first object or deletion of a range of records in the first 
object, that deletion operation is performed in a way that permits concurrent transaction activity. 
[0047] It is noted that the implementation of database reconfiguration operations using well established, optimized, 
database recovery process procedures makes the database reconfiguration operations efficient in terms of computation 

so time and in terms of the computer resources used. 

[0048] Appendix 1 lists a pseudocode representation of the Move Partition procedure used by the SQL catalog man- 
ager to move a database table partition from one disk to another 

[0049] Appendix 2 lists a pseudocode representation of the Split Partition procedure used by the SQL executor to 
split a database table partition into two partitions. 
55 [0050] Appendix 3 lists a pseudocode representation of the Move Partition Boundary procedure used by the SQL 
catalog manager to move rows (i.e., records) of a database table between adjacent partitions (i.e., partitions with 
adjacent ranges of the table's primary index). 

[0051] Appendix 4 lists a pseudocode representation of the Create Index procedure used by the SQL catalog man- 
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ager to create a new "alternate" index for a database table. 

[0052] Appendix 5 lists a pseudocode representation of an alternate embodiment of the Create Index procedure. 
[0053] Appendices 6, 7 and 8 lists pseudocode representations of procedures used by the Create Index procedure. 
[0054] The pseudocode used in Appendices 1 through 8 is, essentially, a computer language using universal com- 
5 puter language conventions. While the pseudocode employed here has been invented solely for the purposes of this 
description, it is designed to be easily understandable by any computer programmer skilled in the art. 

Move Partition Procedure 

w [0055] The function of the Move Partition procedure is to move a database table or index, or a specified partition of 
the table or index, from a first disk location to a second disk location. The procedure is executed in response to a Move 
Partition command. The second disk location is typically located on a different disk from the first, and often will be 
located on a different computer node from the first. For the purposes of explaining the Move Partition procedure, it will 
be assumed that a partition of a database table is being moved, with the understanding that the same steps would be 

15 performed for moving an entire unpardoned database file or for moving an index file or a partition of an index file. 
[0056] Referring to Figure 4 and Appendix 1 , the steps of the Move Partition procedure are as follows. The database 
partition to be moved, herein called the Old Partition, is accessed, and a new file for the new database partition, herein 
called the New Partition, is created on the appropriate disk volume. Furthermore, an audit trail pointer, AuditPtr, is set 
to point to where the next record in the audit trail for transactions against the database table associated with the Old 

20 Partition will be located. 

[0057] In the first phase (220) of the Move Partition procedure, the records in the Old Partition are accessed using 
Browse access (i.e., read through locks), and records are copied from the Old Partition to the New Partition. While 
records are created in the New Partition, an up-to-date primary index B-tree is maintained for the New Partition. To 
prevent further repetition, it is noted that for all records created and updated by the online DDL procedures, a corre- 

25 sponding B-tree is updated. 

[0058] Transactions by end users against the Old Partition are allowed to continue during this phase of the Move 
Partition procedure, and those transactions continue to insert, delete and update records in the Old Partition and create 
new audit trail entries in the audit trail. 

[0059] In the second phase (222) of the Move Partition procedure, transactions by end users againstthe Old Partition 
30 are allowed to continue, and those transactions insert, delete and update records in the Old Partition and create new 
audit trail entries in the audit trail. 

[0060] At the beginning of the second phase, an AuditTrail filter is established such that only Audit Trail records that 
pertain to the Old Partition are received for processing (by an Audit Fixup Process started by the SQL catalog manager). 
The processing of each audit record that passes the filter is called an "audit fixup". 

35 [0061] Each received audit record that passes the filter is initially processed by modifying the audit record to refer 
to the NewPartition. Next, the modified audit record is inspected to determine whether the change to the table noted 
in the audit record is already reflected in the copied records. If so, the audit record is ignored. Otherwise, a "redo" with 
respect to the modified audit record is performed so as to apply the change noted in the modified audit record to the 
NewPartition. This process is repeated for each received audit record that passes the filter until the end of the audit 

40 trail is reached. 

[0062] In the third phase (224) of the Move Partition procedure, a transaction request is made for a file lock on the 
Old Partition. This is a transaction request that goes in the lock queue. User transactions initiated before the third phase 
are not affected, but the lock request prevents users from committing transactions initiated after the lock request is 
made. When the requested lock is granted the remainder of the third phase is performed as a unitary transaction. The 

45 first part of the third phase transaction is performing the audit fixup process described above for all audit records 
referencing the Old Partition (i.e., that pass the filter) that were created after completion of the second phase. 
[0063] When processing of the last such audit record in the audit trail is completed, the third phase transaction 
requests a lock on the entire database table associated with the Old Partition. This prevents all user transactions on 
the entire database table until the lock is released. When the full table lock is granted, the catalog entry for the Old 

50 Partition is deleted and a new catalog entry for the New Partition is created. The file labels and catalog entries for all 
partitions of the database table are also updated so as to reference the New Partition. 

[0064] When all catalog and file label updates have been completed, the lock on the database table is released, 
enabling user transactions against the database table to resume. Finally, the Old Partition is deleted, the disk space 
used by the Old Partition file is released, and then the third phase transaction is concluded. 

55 

Split Partition Procedure 

[0065] The function of the Split Partition procedure is to move a portion of a database table, or a portion of specified 
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partition of a database table, from a first disk location to a second disk location. The procedure is executed in response 
to a Split Partition command. The second disk location is typically located on a different disk from the first, and often 
will be located on a different computer node from the first. For the purposes of explaining the Split Partition procedure, 
it win oe assumed that a partition of a database file is being split into two partitions, with the understanding that the 
5 same steps would be performed for splitting a previously unpardoned database file or for splitting an index file or a 
partition of an index file. 

[0066] Referring to Figure 5 and Appendix 2, the steps of the Split Partition procedure are as follows. The database 
partition to be split, herein called the Old Partition, is accessed, and a new file for the new database partition, herein 
called the New Partition, is created on the appropriate disk volume. Furthermore, an audit trail pointer, AuditPtr, is set 
10 to point to where the next record in the audit trail for transactions against the database table associated with the Old 
Partition will be located. 

[0067] The primary key ranges associated with the Old Partition and New Partition are as follows. The Old Partition's 
initial primary key range is called OldRange, and its new, reduced primary key range is called NewRangel . The primary 
key range for the New Partition is called NewRange2. 
15 [0068] In the first phase (230) of the Split Partition procedure, the records in the Old Partition having a primary key 
in NewRange2 are accessed using Browse access (i.e., read through locks), and are copied from the Old Partition to 
the New Partition. 

[0069] Transactions by end users against the Old Partition are allowed to continue during this phase of the Split 
Partition procedure, and those transactions insert, delete and update records in the Old Partition and create new audit 
20 trail entries in the audit trail. 

[0070] In the second phase (232) of the Split Partition procedure, transactions by end users against the Old Partition 
are allowed to continue, and those transactions insert, delete and update records in the Old Partition and create new 
audit trail entries in the audit trail. 

[0071] At the beginning of the second phase, an AuditTrail filter is established such that only Audit Trail records that 

25 pertain to the Old Partition with a primary key value in NewRange2 are received for processing. 

[0072] Each remaining received audit record is initially processed by modifying the audit record to refer to the New- 
Partition. Next, the modified audit record is inspected to determine whether the change to the table noted in the audit 
record is already reflected in the copied records. If so, the audit record is ignored. Otherwise, a "redo" with respect to 
the modified audit record is performed so as to apply the change noted in the modified audit record to the NewPartition . 

30 This process is repeated for each received audit record that passes the filter until the end of the audit trail is reached. 
[0073] In the third phase (234) of the Split Partition procedure, a transaction request is made for a lock on the Old 
Partition. This is a transaction request that goes in the lock queue. User transactions initiated before the third phase 
are not affected, but the lock request prevents users from committing transactions initiated after the lock request is 
made. When the requested lock is granted the remainder of the third phase is performed as a unitary transaction. The 

35 first part of the third phase transaction is performing the audit fixup process described above for all audit records 
referencing the Old Partition that were created after completion of the second phase and for which either the old record 
image or new record image has a primary key value in NewRange2 (i.e., that pass the filter). 
[0074] When processing of the last such audit record in the audit trail is completed, the B-tree for Old Partition is 
split into two B-trees, one for records having a primary key in NewRangel and the other for records having a primary 

40 key in NewRange2. 

[0075] Then the third phase transaction requests a lock on the entire database table associated with the Old Partition. 
This permits existing user transactions that work on the table to complete, but suspends all new user transactions on 
the entire database table until the lock is released. When the full table lock is granted, the catalog entry for the Old 
Partition is deleted and a catalog entry for the New Partition is created. The file labels and catalog entries for all partitions 

45 of the database table are also updated so as to reference the New Partition. 

[0076] When ail catalog and file label updates have been completed, the lock on the database table is released, 
enabling user transactions against the database table to resume. Finally, an access check is put in place to make 
application transactions unable to access records in the NewRange2 B-tree of the OldPartition, and then the third 
phase transaction is concluded. 

50 [0077] All records in the NewRange2 part of the Old Partition are deleted without interfering with concurrent appli- 
cation transactions against the table, and then the access check for the OldPartition is removed. 

Move Partition Boundary Procedure 

55 [0078] The function of the Move Partition Boundary procedure is to move records in a specified range of primary key 
values from a first specified partition of a database table to a second adjacent partition of that database table. The 
second partition is typically located on a different disk from the first, and often will be located on a different computer 
node from the first. The procedure is executed in response to a Move Partition Boundary command. For the purposes 
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of explaining the Move Partition Boundary procedure, it will be assumed that records are being moved between adjacent 
database file partitions, with the understanding that the same steps would be performed for moving records between 
adjacent index file partitions. 

[0079] Referring to Figure 6 and Appendix 3, the steps of the Move Partition Boundary procedure are as follows. 
5 The database partition from which records are to be moved is herein called Partitionl and the database partition to 
which those records are to be moved is called Partition2. An audit trail pointer, AuditPtr, is set to point to where the 
next record in the audit trail for transactions against the database table will be located. 

[0080] The primary key ranges associated with Partitionl and Partition2 are as follows. Partitionl 's initial primary key 
range is called OldRangel , and its new, reduced primary key range is called NewRangel . The initial primary key range 
10 for the Partition2 is called OldRange2 and its new expanded key range is called NewRange2. The primary key range 
of the records to be moved, called MoveRange, is equal to the range of primary key values in NewRange2 that are 
not in OldRange2. 

[0081] In the first phase (240) of the Move Partition Boundary procedure, the records in the Old Partition having a 
primary key in the MoveRange are accessed using Browse access (i.e., read through locks), and are copied from 
is Partitionl to Partition2. 

[0082] Transactions by end users against Partitionl and Partition2 are allowed to continue during this phase of the 
Move Partition Boundary procedure, and those transactions insert, delete and update records in Partitionl and 
Partition2 and create new audit trail entries in the audit trail. 

[0083] In the second phase (242) of the Move Partition Boundary procedure, transactions by end users against 
20 Partitionl and Partition2 are allowed to continue, and those transactions insert, delete and update records in Partitionl 
and Partition2 and create additional new audit trail entries in the audit trail. 

[0084] At the beginning of the second phase, an AuditTrail filter is established such that only Audit Trail records that 
pertain to the Partitionl are received for processing by the SQL catalog manager. Furthermore, those audit records for 
which neither the old record image (if any) nor the new record image (if any) have a primary key value in the MoveRange 
25 are also ignored. 

[0085] Each remaining received audit record is initially processed by modifying the audit record to refer to the 
Partition2. Next, the modified audit record is inspected to determine whether the change to the table noted in the audit 
record is already reflected in the copied records. If so, the audit record is ignored. Otherwise, a "redo" with respect to 
the modified audit record is performed so as to apply the change noted in the modified audit record to Partition2. This 

30 process is repeated for each received audit record that passes the filter until the end of the audit trail is reached. 

[0086] In the third phase (244) of the Move Partition Boundary procedure, a transaction request is made for a lock 
on the Old Partition. This is a transaction request that goes in the lock queue. User transactions initiated before the 
third phase are not affected, but the lock request prevents users from committing transactions initiated after the lock 
request is made. When the requested lock is granted the remainder of the third phase is performed as a unitary trans- 

35 action. The first part of the third phase transaction is performing the audit fixup process described above for all audit 
records referencing Partitionl that were created after completion of the second phase and for which either the old record 
image or new record image has a primary key value in the MoveRange (i.e., that pass the filter), 
[0087] When processing of the last such audit record in the audit trail is completed, the B-tree for Partitionl is split 
into two B-trees, one for records having a primary key in NewRangel and the other for records having a primary key 

40 jn the MoveRange. 

[0088] Then the third phase transaction requests a lock on the entire database table associated with Partitionl and 
Partition2. This permits existing user transactions that work on the table to complete, but suspends all new user trans- 
actions on the entire database table until the lock is released. When the full table lock is granted, the catalog entries 
and file labels for Partitionl and Partition2 are updated to reflect their new primary key ranges. Furthermore, the catalog 
45 entries for all partitions of the database table are also updated so as to reference the key ranges of Partitionl and 
Partition2. 

[0089] When all catalog and file label updates have been completed, the lock on the database table is released, 
enabling user transactions against the database table to be initiated and executed. Finally, an access check is put in 
place to make application transactions unable to access records in the MoveRange B-tree of the OldPartition, and then 
50 the third phase transaction is concluded. 

[0090] All records in the MoveRange part of the Old Partition are deleted without interfering with concurrent appli- 
cation transactions against the table, and then the access check for the OldPartition is removed. 



55 



Create Index Procedure 

[0091] The function of the Create Index procedure is to create an alternate index for a database table, where the 
alternate index is generally a different function of the database table columns than the primary index for the database 
table. 
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[0092] Referring to Figure 6 and Appendices 4, 5, 6 and 7, the steps of the Create Index procedure are as follows. 
The database object from which records are to be indexed is herein called the Base Table and the alternate index 
being created is called the Newindex. The set of columns used to generate the alternate key for the Newlndex are 
represented by a vector called the AltemateKeyColumns vector. Furthermore, a flag called UniqueindexFlag is set to 
5 True if the Newlndex is to be a unique index, which means that every record in the Base Table must have a unique 
alternate index value, and is set to False if multiple records in the Base Table are allowed to have the same alternate 
index value. 

[0093] An audit trail pointer, AuditPtr, is set to point to where the next record in the audit trail for transactions against 
the database table will be located. 
w [0094] In the first phase (250) of the Create Index procedure, the records in the Base Table are accessed using 
Browse access (i.e., read through locks), and for each such record a "create new index record" procedure 251 is 
performed. The "create new index record" procedure 251 , which is also used in the later phases of the Create Index 
procedure, begins by computing primary and alternate index values for a particular record using the index generation 
functions: 

15 

PrimaryKeyl = Create_PrimaryKey(Record, PrimaryKeyColumns) 



20 AltKeyl = Create_AltKey(Record, PrimaryKeyColumns) 

[0095] If the UniqueindexFlag is False, an index record with AltKeyl and PrimaryKeyl as its two fields is stored in 
the Newlndex. 

[0096] If the UniqueindexFlag is set to True, prior to storing an index record with AltKeyl and PrimaryKeyl as its 

25 two fields in the Newlndex, the Newlndex is searched to see if Newlndex already has a record with an alternate index 
value of AltKeyl. If such a record is found in the Newlndex, a potential duplicate index problem exists. To resolve 
whether a duplicate index problem has been encountered, two tests are performed. First, a repeatable read transaction 
is performed on the Base Table to see if the Base Table still stores a record with primary and alternate key values of 
PrimaryKeyl and AltKeyl . If such a record is found in the Base Table, the second test is performed. 

so [0097] The existing Newlndex record with an alternate key value of AltKeyl is read (with a repeatable read operation) 
to determine the primary key value, PrimaryKey2, stored with it. If this Newlndex record still exists, a second repeatable 
read transaction is performed on the Base Table to see if the Base Table stores a record with a primary key value of 
PrimaryKey2 and an alternate key value of AltKeyl . If both tests produce positive results, the Create Index procedure 
is aborted because two database records with identical alternate key values have been encountered. Otherwise, if 

35 either test returns a negative result, an index record with AltKeyl and PrimaryKeyl as its two fields is stored in the 
Newlndex, and the index record with AltKeyl and PrimaryKey2 (if any) is deleted from the Newlndex. 
[0098] In the second phase (252) of the Create Index procedure, transactions by end users against the Base Table 
are allowed to continue, and those transactions insert, delete and update records in the Base Table and create additional 
new audit trail entries in the audit trail. 

40 [0099] At the beginning of the second phase, an AuditTrail filter is established such that only Audit Trail records that 
pertain to the Base Table are received for processing by the SQL catalog manager. Each received audit record is 
processed by an "audit fixup for create index" procedure 253 specially designed for use as part of the Create Index 
procedure. 

[0100] The "audit fixup for create index" procedure 253 processes an audit record as follows. If the audit record 
45 denotes a change to a Base Table record that does not alter either the primary key or the alternate key of the record, 
or the change is already reflected in the Newlndex, the audit record is ignored. 

[0101] If the audit record denotes deletion of a Base Table record, an audit record representing deletion of the as- 
sociated Newlndex record is constructed and a "redo" of the newly created audit record against the Newlndex is per- 
formed. As a result, the associated Newlndex record is deleted, if one exists. 

50 [0102] If the audit record denotes addition of a Base Table record, an audit record representing insertion of a corre- 
sponding Newlndex record is constructed and a "redo" of the newly created audit record against the Newlndex is 
performed. If a unique alternate index is being created and a duplicate Newlndex record with the same primary and 
alternate key values is found during this procedure, the index creation procedure is aborted. Otherwise, the aforemen- 
tioned steps result in a new index record being stored in the Newlndex for the inserted Base Table record. 

55 [0103] Finally, if the audit record indicates that either the primary index or alternate index of a record have been 
altered, then a first audit record representing deletion of the Newlndex record associated with the record's old value 
is constructed, and a second audit record representing insertion of a new Newlndex record corresponding to the record's 
new value is constructed. Redo's of both constructed audit records are then performed against the Newlndex. If a 
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unique alternate index is being created and a duplicate Newlndex record with the same primary and alternate key 
values is found during this procedure, the index creation procedure is aborted. Otherwise, the aforementioned steps 
result in a new index record being stored in the Newlndex for the altered Base Table record, arid deletion of the prior 
Newinoex record Tor that Base I able record. 

[0104] In the third phase (254) of the Create Index procedure, a transaction request is made for a lock on the Base 
Table. This is a transaction request that goes in the lock queue. User transactions initiated before the third phase are 
not affected, but the lock request prevents users from committing transactions initiated after the lock request is made. 
When the requested lock is granted the remainder of the third phase is performed as a unitary transaction. The first 
part of the third phase transaction is performing the "audit fixup for create index" process 253 described above for all 
audit records referencing the Base Table that were created after completion of the second phase (i.e., that pass the 
filter). 

[01 05] When processing of the last such audit record in the audit trail is completed, a catalog entry for the Newlndex 
is created, and file labels for all partitions of the Base Table are updated to reflect the existence of the Newlndex. Then 
the lock on the Base Table is released, the third phase transaction is concluded, and the Create Index procedure is 
completed. 

[0106] In an alternate embodiment, represented in Appendix 5, the first phase of the Create Index procedure utilizes 
an indexing procedure that processes the entire Base Table, even if records that potentially have duplicate alternate 
key values are encountered. All such potential duplicate records are copied to a scratch pad memory area, and then 
are processed by the "create new index record" procedure 251 to determine if records with duplicate alternate key 
values in fact exist, and to add additional records to the Newlndex if no duplicate alternate key values are found. 

ALTERNATE EMBODIMENTS 

[01 07] While the present invention has been described with reference to a few specific embodiments, the description 
is illustrative of the invention and is not to be construed as limiting the invention. Various modifications may occur to 
those skilled in the art without departing from the scope of the invention as defined by the appended claims. 
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APPENDIX 1 

Pseudocode Representation of Move Partition Procedure 

Procedure MovePartition (OldPartition. NewPartition) 
r OldPartition identifies the partition to be moved 

NewPartition identifies the new partition to be created 

BaseTable is the database table associated with the OldPartition 7 

{ 

/•Phase 1: Dirty Copy*/ 

Access current database partition (OldPartition) to be moved 
Create file for new database partition (NewPartition) 
Create file label for the NewPartition 
AuditPtr = End of Audit Trail 

T AuditPtr initially points to where the next record In the Audit Trail will 
be located, when it is generated V 

While accessing records in the OldPartition, using Browse access (i.e., read 
through locks), copy records from the OldPartition to the NewPartition 

Maintain an up-to-date primary index B-tree for each record copied into the 
NewPartition 

All transactions by end users continue to insert, delete and update records in 
the OldPartition and to create new audit trail entries 

P Phase 2: Partial Cleanup V 

Establish AuditTrail filter Access only Audit Trail records that pertain to the 

OldPartition 
Do Until End of Audit Trail is reached 

{ 

f Process entry in Audit Trail at AuditPtr 7 

Modify the Audit Record to refer to the NewPartition 

If the change to the table noted in the audit record is already reflected in 

the copied records 

{ Ignore Audit Record } 
Else 

{ 

Perform a "Redo" with respect to the modified Audit Record so as to 
apply the change noted in the Audit Record to the NewPartition 
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Update B-tree for the NewPartition for each update to the 
NewPartition 

} 

Update AuditPtr to Next Audit Record (skip records that do not pertain to 

the OldPartition) 
} r End Do Loop 7 

P Phase 3: Final Cleanup 7 

Request and Wait for Lock on the OldPartition 

/* This is a transaction request that goes in the lock queue. This does not 
affect transactions initiated before Phase 3 9 but prevents users from 
committing transactions initiated after the Lock Request is made. 7 

r When Lock is granted ... 7 

Do As A Unitary Transaction 

{ 

LastPtr s Last Record in Audit Trail 

/* Previously established Audit Trail Filter continued to filter out audit 

records not applicable to the OldPartition 7 
Do Until End of AuditPtr reaches LastPtr 

{ 

Modify the Audit Record to refer to the NewPartition 

Perform a "Redo" with respect to the modified Audit Record so as to 

apply the change noted in the Audit Record to the NewPartition 
Update B-tree for the NewPartition for each update to the 

NewPartition 

Update AuditPtr to Next Audit Record (skip records that do not 

pertain to the OldPartition) 
} r End Do Loop 7 

Request Lock on the entire Base Table 
When Lock is granted: 

Delete Catalog entry for the OldPartition 

Create Catalog entry for the NewPartition 

Update file labels and catalog entries for all partitions of BaseTable 
so as to reference the NewPartition 

Release Lock on BaseTable, enabling user transactions against the 
Base Table to resume execution 
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Delete the OldPartition and release disk space used by the OldPartition 
file 

} r End of transaction */ 
Return 
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APPENDIX 2 

Pseudocode Representation of Split Partition Procedure 

Procedure SplitPartition (BaseTable, OldPartition, OldRange, NewRangel, 
NewRange2, NewPartition) 

r OldPartition identifies the partition to be split 

NewPartition identifies the new partition to be created 

OldRange is the range of Primary Key values currently assigned to 

the OldPartition 

NewRangel is the new range of Primary Key values assigned to the 
OldPartition 

NewRange2 is the range of Primary Key values assigned to the 
NewPartition to be created 

BaseTable is the database table associated with the OldPartition 

7 

Access current database partition (OldPartition) to be split 
Create file for new database partition (NewPartition) 
Create file label for the NewPartition 
AuditPtr = End of Audit Trail 

r AuditPtr initially points to where the next record in the Audit Trail will 
be located, when it is generated 7 

r Phase 1: Dirty Copy*/ 

While accessing records in the OldPartition, using Browse access (i.e., read 
through locks), copy records having a Primary Key value in NewRange2 
from the OldPartition to the NewPartition 

Maintain an up-to-date primary index B-tree for each record copied into the 
NewPartition 

All transactions by end users continue to insert, delete and update records in 
the OldPartition and to create new audit trail entries 

r Phase 2: Partial Cleanup 7 

Establish AudHTrail filter Access only Audit Trail records that pertain to the 

OldPartition with a Primary Key value in NewRange2 
Do Until End of Audit Trail is reached 

{ 
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r Process entry in Audit Trail at AuditPtr 7 
Modify the Audit Record to refer io rartiiiun2 
If the change to the table noted in the audit record is not already 
reflected in the copied records 

{ 

Perform a "Redo" with respect to the modified Audit Record so as to 

apply the change noted in the Audit Record to Partition2 
Update B-tree for Partition2 for each update to Partition2 

} 

Update AuditPtr to Next Audit Record (skip records that do not pertain to 

Partition 1 or with a Primary Key value not in NewRange2) 
} r End Do Loop 7 

f* Phase 3: Final Cleanup 7 
Request Lock on the OldParlition 

r This is a transaction request that goes in the lock queue. This does not 
affect transactions initiated before Phase 3, but prevents users from 
committing transactions initiated after the Lock Request is made. 7 

P When Lock is granted ... 7 

Do as a Unitary Transaction: 
{ 

LastPtr 55 Last Record in Audit Trail 

r Previously established Audit Trail Filter continued to filter out audit 

records not applicable to NewRange2 in the OldPartition 7 
Do Until End of AuditPtr reaches LastPtr 

{ 

Modify the Audit Record to refer to the NewPartition 

Perfoim a "Redo* with respect to the modified Audit Record so as to 

apply the change noted in the Audit Record to the NewPartition 
Update B-tree for the NewPartition for each update to the 

NewPartition 

Update AuditPtr to Next Audit Record (skip records that do not 
pertain to NewRange2 in the OldPartition) 
} 

} r End Do Loop 7 
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" " OVWl ww 



Sow the B-tree for Partitionl into two B-trees. one for records .n 
" NewRangel (those remaining in Partitionl) and one ,of *~ . 
in NewRange2 to be removed from Partitionl (the Prune B-tree). 

Request Lock on the entire Base Table 
When Lock is granted: 

Update Catalog entry for the OldPartrtion to reflect new index 

boundaries for the OldPartition 
Update label for the OldPartition, including putting in place an 
access check that prevents user transaction from accessing 
NewRange2 
Create Catalog entry for the NewPartition 
Update file labels and catalog entries for all partitions of BaseTable 
to reference the NewPartition 

Release Lock on BaseTable, enabling user transactions against the 
Base Table to resume execution 

} / * end of unitary transaction for third phase 7 

r Phase 4: Background garbage collection */ 

In background mode: 

Delete all records from the OldPartition that are referenced by the Prune 
B-tree (this releases unused space from the OldPartrt.on file) 

Remove access check in the OldPartition 
Return 
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APPENDIX 3 

Pseudocode Representation of Move Partition Boundary Procedure 

Procedure MovePartitionBoundary (Partition*!, OldRange1 t NewRangel, 
Partition^ OldRange2, NewRange2) 

r BaseTable is database table associated with Partitionl and 
Partiiion2 

Partitionl and Partition2 identify two partitions having adjacent 
assigned Primary Key Ranges 

OldRangel is the range of Primary Key values currently assigned to 
Partitionl 

OldRange2 is the range of Primary Key values currently assigned to 
Partition2 

NewRangel is the new range of Primary Key values assigned to 
Partitionl 

NewRange2 is the new range of Primary Key values assigned to 
Partition2 

7 

Access database Partitionl and Partition2 
AuditPtr = End of Audit Trail 

r AuditPtr initially points to where the next record in the Audit Trail will 
be located, when it is generated 7 
Generate MoveRange = range of Primary Key values for records in 

NewRange2 that are not in OldRange2 

r Phase 1: Dirty Copy 7 

While accessing records in Partitionl using Browse access, copy records 

having a Primary Key value in MoveRange from Partitionl to Partition2 
Update primary index B-tree for Partition2 for each record copied into 

Partitions Leave B-tree for Partitionl unchanged. 
AH transactions by end users continue (A) to insert, delete and update 

records in Partitionl and Partition2 using old partition range boundaries 

and (B) to create new audit trail entries 
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I* Phase 2: Partial Cleanup 7 

Establish AuditTra!! filter. Access onh* Audit Trail records that pertain to 

Partition! with Primary Key values in MoveRange 
Do Until End of Audit Trail is reached 

{ 

f Process entry in Audit Trail at AuditPtr 7 

Modify the Audit Record to refer to Partition2 

If the change to the table noted in the audit record is not already 

reflected in the copied records 

{ 

Perform a "Redo" with respect to the modified Audit Record so as to 

apply the change noted in the Audit Record to Partition2 
Update B-tree for Partition2 for each update to Partition2 

} 

Update AuditPtr to Next Audit Record (skip records that do not pertain to 

Partition! or with a Primary Key value not in MoveRange) 
} r End Do Loop 7 

r Phase 3: Final Cleanup 7 

Request Lock on Partition 1 and Partition2 

r This is a transaction request that goes in the lock queue. This does not 
affect transactions initiated before Phase 3, but prevents users from 
committing transactions initiated after the Lock Request is made. V 

/* When Lock is granted ... 7 

Do as a Unitary Transaction: 

{ 

LastPtr = Last Record in Audit Trail 

r Previously established Audit Trail Filter continued to filter out audit 
records not applicable to MoveRange in Partition! 7 

Do Until End of AuditPtr reaches LastPtr 
{ 

Modify the Audit Record to refer to Partition2 

Perform a "Redo - with respect to the modified Audit Record so as to 
apply the change noted in the Audit Record to Partition2 
Update B-tree for Partition2 for each update to Partition2 

Update AuditPtr to Next Audit Record (skip records that do not 
pertain to MoveRange in Partition!) 

} r End Do Loop 7 
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Split the B-tree for Partitionl into two B-trees f one for records in 

NewRangel (those remaining in Partitionl) and one for the records 
in the MoveRange to be removed from Partitionl (the Prune B-tree). 

Request Lock on entire Base Table 
When Lock is granted: 

Update Catalog entry for Partitionl to reflect new partition key 

boundaries for Partitionl 
Update label for Partitionl. including putting in place an access 
check that prevents user transactions from accessing 
MoveRange 

Update Catalog entry for Partition2 to reflect new partition key 

boundaries for Partition2 
Update label for Partition2 

Update file labels and catalog entries for all partitions of BaseTable 
to reflect the new partition key boundaries for Partitionl and 
Partition2 

Release Locks on BaseTable (including Partitionl and Partition2) 9 
enabling user transactions against the Base Table to resume 
execution 

} / * end of unitary transaction for third phase 7 

r Phase 4: Background garbage collection 7 
In background mode: 

Delete all records from Partitionl that are referenced by the Prune 
B-tree (this releases unused space from Partitionl file). 

Remove access check in Partitionl 
Return 
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Pseudocode Representation of Create Index Procedure 

Createlndex Procedure (BaseTable, AltlndexColumns, UniquelndexFlag, 
Newlndex) 

/* BaseTable is table for which index is to be created. 

AltlndexColumns is a vector listing the table columns that define the 
key for the New Index. 

UniquelndexFlag is equal to True if the new index must be a unique 
index (i.e., an index for which every base table record has a unique 
alternate key value). 

Newlndex is the new index to be created 
PrimarylndexColumns is a vector listing the table columns that 
define the key for the Primary Index of the Base Table 

•/ 

I* Structure of Newlndex (and every other index, other than the primary 
index) is: 

File Label - embedded copy of catalog information for index 
B-tree - the B-tree for the index 
Index Records, where each index record has two fields: 
AltKey, PrimaryKey 

where the AltKey field stores the value of the AltlndexColumns 
for one record of the BaseTable, and the PrimaryKey field 
stores the value of the primary key (i.e., the 
PrimarylndexColumns) for that same record 

7 

Create File for Newlndex, including root node for the new index's B-tree 
Create file label for Newlndex 
AudrtPtr = End of Audit Trail 

/* AuditPtr initially points to where the next record in the Audit Trail will 
be located, when it is generated 7 
RecPtr = First Record of BaseTable 
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r Phase 1: Dirty Copy */ 

Do Until Fnd of Bas«Table is Reached 

{ 

Access, using Browse access, BaseTable(RecPtr) 
AltKeyl * Create_AHKey( BaseTable(RecPtr), AltlndexColumns) 
PrimaryKeyl = Create_PrimaryKey( BaseTable(RecPtr), 
PrimarylndexColumns ) 

/* Check for Duplicate Index Values only if UniquelndexFlag is True*/ 
If UniquelndexFlag 

{ 

Search Newlndex for an index record with 

Newlndex.AltKey equal to AltKeyl 
If a match is found 

{ 

Call CheckDupRecord (BaseTable, Newlndex, 

BaseTable(RecPtr)) 
If RetumCode = 'fail" 

Abort Createlndex Procedure 

} 

Else 

Call CreatelndexRecord (Newlndex, AltKeyl, PrimaryKeyl) 

} 

Else / * UniquelndexFlag is False */ 

Call CreatelndexRecord (Newlndex, AltKeyl, PrimaryKeyl) 

Increment RecPtr to point to next BaseTable record, if any 
} I* end of Do loop 7 

r Phase 2: Partial Cleanup V 

Establish AuditTrail filter Access only Audit Trail records that pertain to the 

Base Table 
Do Until End of Audit Trail is reached 

{ 

Call AuditRecord_to_Newlndex (AuditPtr, BaseTable, Newlndex, 

IndexPtr) for the current Audit Record 
Update AuditPtr to Next Audit Record (skip records that do not pertain to 

the Base Table) 

} 
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P Phase 3: Final Cleanup 7 
Request Lock on the Base Table 

I* This is a transaction request that goes in the lock queue. This does not 
affect transactions initiated before Phase 3, but prevents users from 
committing transactions initiated after the Lock Request is made. */ 

I* When Lock is granted ... 7 

Do as a Unitary Transaction: 

{ 

LastPtr = Last Record in Audit Trail 

r Previously established Audit Trail Filter continued to filter out audit 

records not applicable to the Base Table 7 
Do Until End of AuditPtr a LastPtr 

{ 

Call AuditRecord_to_Newlndex (AuditPtr, BaseTable, Newlndex) for 

the current Audit Record 
Update AuditPtr to Next Audit Record (skip records that do not 

pertain to the Base Table) 
} r End Do Loop 7 

Create Catalog entry for Newlndex 

Update file labels and catalog entries for all partitions of BaseTable to 
reference the Newlndex 

Release Lock on BaseTable, enabling user transactions against the 

Base Table to resume execution 
} / * end of unitary transaction for third phase 7 

Return 
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APPENDIX 5 

Pseudocode for Second Preferred Embodiment 
of Phase 1 of Create Index Procedure 

r Phase 1: Dirty Copy 7 
Initialize Scratchpad pointer SP_Ptr 
Do Until End of BaseTable is Reached 
{ 

Access, using Browse access, BaseTable(RecPtr) 
AltKeyl = Create_AltKey( BaseTable(RecPtr), AltlndexColumns) 
PrimaryKeyl = Create_PrimaryKey( BaseTable(RecRr), 
PrimarylndexColumns ) 

f Check for Duplicate Index Values only if Uniquelndex is True*/ 
If UniquelndexFlag 
{ 

Search Newlndex for an index record with 

Newlndex.AitKey equal to AltKeyl 
If a match is found 

{ 

Store RecPtr in ScratchPad(SP_Ptr) 

Increment SP_Ptr 

} 

Else 

Call CreatelndexRecord (Newlndex, AltKeyl, PrimaryKeyl) 

} 

Else / * UniquelndexFlag is False 7 

Call CreatelndexRecord (Newlndex, AltKeyl, PrimaryKeyl) 

Increment RecPtr to point to next BaseTable record, if any 
} /* end of Do loop 7 

I* Process Records in Scratchpad */ 
Initialize Scratchpad pointer SP_Ptr 
Do Until End of Scratchpad is reached 
{ 

RecPtr = ScratchPad(SP_Ptr) 

Call CheckDupRecord (BaseTable, Newlndex, BaseTable(RecPtr)) 
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If RetumCode = fall" 

Abort Createlndex Procedure 
Increment SP_Ptr to point to next Scratchpad entry, If any 
} 

r end of Phase 1 7 



APPENDIX 6 



Pseudocode Representation of Create Index Record Procedure 

Procedure CreatelndexRecord (Newlndex, NewAttKey, NewPrimaryKey) 

I* Newlndex is the new index in which a new record is to be created. 
NewAttKey is the alternate key of a record in the BaseTable for 

which an index record is to be created. 
NewPrimaryKey is the primary key of a record in the BaseTable for 
which an index record is to be created. 

*/ 



Create NewlndexRecord(AItKey.PrimaryKey): 
NewlndexRecordAltKey ■ NewAltKey 
NewlndexRecord.PrimaryKey = NewPrimaryKey 

If NewlndexRecord is already present in Newlndex 
Return 

Else 
{ 

Store NewlndexRecord in Newlndex 
Add B-tree entry for the NewlndexRecord 
} 



Return 
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APPENDIX 7 

Pseudocode Representation of Check Duplicate Record Procedure 

Procedure CheckDupRecord (BaseTable, Newlndex, DupRecord) 

r BaseTable is a database table 

Newlndex is an index file being created for the BaseTable 
DupRecord Is an index record whose AltKey value is the same as 
the AltKey value of an already existing index record 

7 

I* Generate alternate key value and primary key value from DupRecord 7 
AltKeyValue2 = Create_AltKey( DupRecord, AltlndexColumns) 
PrimaryKeyValue2 = Create_PrimaryKey(DupRecord, PrimarylndexColumns) 

As a unitary transaction: 
{ 

Access the base table record, if any, having a primary key value of 
PrimaryKeyValue2, requesting a lock on the requested record 

If Record=BaseTable(PrimaryKeyVaIue2) is found in base table 
{ 

r Extract the alternate key value for that base record 7 
AltKeyValueY ■ Record(PrimaryKeyValue2).AItKey 
If AltKeyValueY * AltKeyValue2 

TransactionReturn ("DupRecord not found") 

} 

Else 

TransactionReturn ("DupRecord not found") 

} 

If TransactionReturn = "DupRecord not found" 

F A Base Table record for DupRecord was not found, indicating that 
an index record for DupRecord is not needed. 7 

{ 

Return (Success) 
} 

r A base table record matching DupRecord's Primary and Alternate keys 
was found in base table. 
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First step: Read the already existing Newlndex record whose alternate 
key value is DupRecord.AltKey. 

Next step: Check to see if a record with the primary key value found in 
this Newlndex record is still in Base Table 

*/ 

PrimaryKeyValuel ■ Newlndex(AltKeyValue2).PrimaryKey 
As a unitary transaction: 
{ 

Access the base table record, if any, having a primary key value of 
PrimaryKeyValuel, requesting a lock on the requested record 

If record is found in base table 
{ 

I* Extract the alternate key value for that base record 7 
AHKeyValueX = BaseTable(PrimaryKeyValue1).AItKey 
If AltKeyValueX * AltKeyValue2 

TransactionRetum ("conflicting record not found") 

} 

Else 

TransactionRetum ("conflicting record not found") 

} 

If TransactionRetum = "conflicting record not found" 
{ 

1* No Duplicate Record Problem because original record has been 

deleted or altered 7 
Oelete Newtndex(AltKeyValue2) 

Call CreatelndexRecord (Newlndex, AhKey2, PrimaiyKey2) 

Return (Success) 

} 

Else 
{ 

I* a conflicting base record was found 7 
Return (Fail, "Duplicate key value found") 
} 

1* End of CheckDupRecord Procedure 7 
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Pseudocode Representation of 'Audit Record to Newlndex" Procedure 

Procedure AuditRecordJo_Newlndex (AuditPtr, BaseTable, Newlndex) 
r BaseTable is database table 
Newlndex is index file 
AuditPtr is a pointer to an Audit Trail record 

7 

r Ignore irrelevant Audit Record 7 

If Audit Record denotes a change to a base table record that does not 
alter either the PrimaryKey or AitKey of the record 
{ Ignore Audit Record } 

P Record deleted from Base Table 7 

Elself Audit Record denotes deletion of a base table record 

{ 

Create an audit record representing deletion of the associated 
Newlndex record 

If this record exists in Newlndex (i.e., matching both the PrimaryKey 
and AitKey of the deleted base table record) 

Redo the newly created audit record against Newlndex 

} 

P New Record Added to Base Table 7 

Elself Audit Record denotes addition of a record to the base table 
{ 

Create an audit record representing insertion of the associated 

Newlndex record 
If UniquelndexFlag 

{ 

AltKeyl = Create_AKKey( Inserted Record, AWndexColumns) 
Search Newlndex for an index record with 

Newlndex.AltKey equal to AltKeyl 
If a match is found 

{ 

Call CheckDupRecord (BaseTable, Newlndex, Inserted 
Record) 
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If RetumCode = "fail* 

Abort Createlndex Procedure 

} 

Else 

Redo the newly created audit record against Newlndex 

} 

Else / * UniquelndexFlag is False V 

Redo the newly created audit record against Newlndex 

} 

r PrlmaryKey and/or AltKey In Base Table record was altered 7 

Elself audit record denotes alteration of a base table record 

{ 

Create a first audit record representing deletion of the Newlndex 
record associated with the base table record's old value 

If this record exists in Newlndex (i.e., matching both the PrimaryKey 

and AltKey of the base table record's old value) 

Redo the first newly created audit record against Newlndex 

Create a second audit record representing insertion of the Newlndex 
record associated with the base table record's new (altered) 
value 

If UniquelndexFlag 
{ 

AltKeyl = Create_AltKey( Altered Record, AltlndexColumns) 
Search Newlndex for an index record with 

Newlndex.MKey equal to AltKeyl 
If a match is found 

{ 

Call CheckDupRecord (BaseTable, Newlndex, Altered 

Record) 
If RetumCode ■ ■fail" 

Abort Createlndex Procedure 

} 

Else 

Redo the second newly created audit record against 
Newlndex 

} 

Else / * UniquelndexFlag is False 7 
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Redo the second newly created audit record against Newlndex 
} r end of processing for "table alteration" audit record 

Return 



10 

Claims 

1 . A computer system for storing and providing user access to data in stored database objects, comprising: 

'5 memory (11 6) for storing said database objects (120, 200), said database objects comprising database tables, 

database partitions and database indices wherein said memory resides in a plurality of interconnected com- 
puter nodes (102); 
a central processing unit (110); 

a transaction manager (134), coupled to said memory, for managing computational transactions that add, 
20 delete and alter data stored in said database objects; said transaction manager (134) including audit trail 

generation instructions for generating an audit trail (138) and storing said audit trail in said memory, said audit 
trail including audit records, at least a subset of said audit records each denoting an event selected from the 
set consisting essentially of addition, deletion and alteration of specified data in a specified one of said database 
objects; 

25 a restructuring procedure, executable by said cental processing unit, for restructuring a specified one of said 

database objects; 

characterized in that said restructuring procedure includes: 

30 first phase instructions for accessing data in a first of said objects and storing corresponding data in a second 

of said objects while allowing continued performance of computational transactions against said first object; 
second phase instructions, for execution after said first phase instructions, for accessing said audit records in 
said audit trail created by said transaction manager during execution of said first phase instructions and there- 
after until execution of said second phase instructions is completed; said second phase instructions updating 

35 said data stored in said second object by redoing with respect to said second object each event denoted by 

said accessed audit trail records; and 

third phase instructions, for execution after said second phase instructions, for obtaining a lock on said first 
object so as to prevent continued performance of computational transactions against said first object, and for 
then accessing audit records in said audit trail created by said transaction manager after execution of said 
40 second phase instructions; said third phase instructions updating said data in said second object by redoing 

with respect to said second object each event denoted by said accessed audit trail records. 

2. The computer system of claim 1 , further characterized by: 

45 at least one of said database tables having a plurality of columns, a plurality of records having primary key 

values based on a first specified subset of said columns, and a primary key index based on said primary key 
values; 

said restructuring procedure comprising a create index procedure for creating an alternate key index having 
a unique alternate key value for each record; 
so said first object comprising one of said database tables and said second object comprising an alternate key 

index generated by said create index procedure; 

said first phase instructions including instructions for accessing said records in said one database table and 
instructions for generating and storing a corresponding index record in said alternate key index, each record 
in said alternate key index including said primary key value for said corresponding database table record and 
55 an alternate key value based on a second specified subset of said columns; 

said second phase and third phase instructions including instructions for deleting an index record in said 
alternate key index when one of said accessed audit records indicates an event making said alternate key 
index record invalid, and instructions for adding index records to said alternate key index when said accessed 
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audit records indicate events that add new database records and events that alter any field of an existing 
database record corresponding to any column in said first and second specified subsets of columns; and 
said generating and storing instructions in said first phase instructions, and said adding index records instruc- 
iiuns in said second phase and ihira pnase instructions including instructions for determining whether two 
records in said one database table have identical alternate key index values and for aborting said create index 
procedure when two such records are found. 

The computer system of claim 1 , further characterized by: 

at least a subset of said database tables each including a plurality of partitions, each partition storing records 
having primary key values in a primary key range distinct from all others of said plurality of partitions; each 
partition stored as a separate data structure in said memory; 

said restructuring procedure comprising a move partition procedure for moving any specified one of said par- 
titions for one of said database tables having a plurality of partitions from a first memory location to a second 
memory location; 

said first phase instructions comprising instructions for creating a new partition at said second memory location 
and for copying all records of said specified partition into said new partition, using read only access to said 
records of said specified partition while performing said copying; 

said second phase instructions update said records in said new partition so as to redo each database table 
record event denoted by said accessed audit trail records; and 

said third phase instructions comprising instructions, for execution after said second phase instructions, for 
obtaining a lock on said one database table, then accessing audit records in said audit trail created by said 
transaction manager after execution of said second phase instructions; said third phase instructions updating 
said records in said new partition so as to redo each database table record event denoted by said accessed 
audit trail records. 

The computer system of claim 3, further characterized by: 

said one database table having an associated schema denoting information about said one database table's 
columns; and 

said third phase instructions including instructions for updating said schema information for said one database 
table to indicate said new partition's memory location, and then releasing said lock on said one database table. 

A method of storing and providing user access to data in stored database objects, comprising the steps of: 

storing said database objects (120, 200) in memory (116), wherein said memory resides in a plurality of inter- 
connected computer nodes (1 02) and said database objects comprising database tables, partitions and indi- 
ces; 

while performing computational transactions that add, delete and alter data stored in said database objects, 
generating an audit trail and storing said audit trail in said memory, said audit trail including audit records, at 
least a subset of said audit records each denoting an event selected from the set consisting essentially of 
addition, deletion and alteration of specified data in a specified one of said database objects; and 
restructuring a specified one of said database objects, 

characterized in that said restructuring step includes: 

in a first phase, accessing data in a first of said objects and storing corresponding data in a second of said 
objects while allowing continued performance of computational transactions against said first object; 
in a second phase, accessing said audit records in said audit trail created during said first phase and thereafter 
until performance of said second phase is completed, and updating said data stored in said second object by 
redoing with respect to said second object each event denoted by said accessed audit trail records; and 
in a third phase, obtaining a lock on said first object so as to prevent continued performance of computational 
transactions against said first object, and then accessing audit records in said audit trail generated after said 
second phase and updating said data in said second object by redoing with respect to said second object 
each event denoted by said accessed audit trail records. 

The method of claim 5, further characterized by: 
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at least one of said database tables having a plurality of columns, a plurality of records having primary key 
values based on a first specified subset of said columns, and a primary key index based on said primary key 
values; 

said restructuring step comprising a step for creating an alternate key index having a unique alternate key 
value for each record; 

said first object comprising one of said database tables and said second object comprising said alternate key 
index generated by said create index step; 
said restructuring step including: 

in said first phase, accessing said records in said one database table and generating and storing a cor- 
responding index record in said alternate key index, each record in said alternate key index including said 
primary key value for said corresponding database table record and an alternate key value based on a 
second specified subset of said columns; 

in said second phase and said third phase, deleting an index record in said alternate key index when one 
of said accessed audit records indicates an event making said alternate key index record invalid, and 
adding index records to said alternate key index when said accessed audit records indicate events that 
add new database records and events that alter any field of an existing database record corresponding 
to any column in said first and second specified subsets of columns; and 

determining whether two records in said one database table have identical alternate key index values and 
aborting said restructuring step when two such records are found. 

The method of claim 5, further characterized by: 

at least a subset of said database tables each including a plurality of partitions, each partition storing records 
having primary key values in a primary key range distinct from ail others of said plurality of partitions; each 
partition stored as a separate data structure in said memory; 

said restructuring step including moving any specified one of said partitions for one of said database tables 
having a plurality of partitions from a first memory location to a second memory location; said moving step 
including: 

in said first phase, creating a new partition at said second memory location and copying all records of said 
specified partition into said new partition, using read only access to said records of said specified partition 
while performing said copying; 

in said second phase, accessing audit records in said audit trail created during execution of said first 
phase and thereafter until performance of said second phase is completed, and updating said records in 
said new partition so as to redo each database table record event denoted by said accessed audit trail 
records; and 

in said third phase, obtaining a lock on said one database table, then accessing audit records in said audit 
trail created after completion of said second phase, and updating said records in said new partition so as 
to redo each database table record event denoted by said accessed audit trail records; said third phase 
further including releasing said lock on said one database table. 

The method of claim 7, wherein each database table has an associated schema denoting information about said 
table's columns, 

said third phase of said moving step including updating said schema information for said one database table 
to indicate said new partition's memory location. 

A computer-readable memory configured to direct a computer system to store and provide user access to data in 
stored database objects, and comprising: 

a transaction manager (134) for managing computational transactions that add, delete and alter data stored 
in said database objects; said database objects (1 20, 200) comprising database tables, partitions and indices; 
said transaction manager including audit trail generation instructions for generating an audit trail (138), said 
audit trail including audit records, at least a subset of said audit records each denoting an event selected from 
the set consisting essentially of addition, deletion and alteration of specified data in a specified one of said 
database objects; and 

a restructuring procedure for restructuring a specified one of said database objects; 



32 



EP 0 723 238 B1 



characterized In that said restructuring procedure includes: 

first phase instructions for accessinq data in a first of said objects and storing corresponding data in a second 
of said objects while allowing continued performance of computational transactions against said first object; 
second phase instructions, for execution after said first phase instructions, for accessing said audit records in 
said audit trail created by said transaction manager during execution of said first phase instructions and there- 
after until execution of said second phase instructions is completed; said second phase instructions updating 
said data stored in said second object by redoing with respect to said second object each event denoted by 
said accessed audit trail records; and 

third phase instructions, for execution after said second phase instructions, for obtaining a lock on said first 
object so as to prevent continued performance of computational transactions against said first object, and for 
then accessing audit records in said audit trail created by said transaction manager after execution of said 
second phase instructions; said third phase instructions updating said data in said second object by redoing 
with respect to said second object each event denoted by said accessed audit trail records. 

10. The computer-readable memory of claim 9, further characterized by: 

at least one of said database tables having a plurality of columns, a plurality of records having primary key 
values based on a first specified subset of said columns, and a primary key index based on said primary key 
values; 

said restructuring procedure comprising a create index procedure for creating an alternate key index having 
a unique alternate key value for each record; 

said first object comprising one of said database tables and said second object comprising an alternate key 
index generated by said create index procedure; 

said first phase instructions including instructions for accessing said records in said one database table and 
instructions for generating and storing a corresponding index record in said alternate key index, each record 
in said alternate key index including said primary key value for said corresponding database table record and 
an alternate key value based on a second specified subset of said columns; 

said second phase and third phase instructions including instructions for deleting an index record in said 
alternate key index when one of said accessed audit records indicates an event making said alternate key 
index record invalid, and instructions for adding index records to said alternate key index when said accessed 
audit records indicate events that add new database records and events that alter any field of an existing 
database record corresponding to any column in said first and second specified subsets of columns; and 
said generating and storing instructions in said first phase instructions, and said adding index records instruc- 
tions in said second phase and third phase instructions including instructions for determining whether two 
records in said one database table have identical alternate key index values and for aborting said create index 
procedure when two such records are found. 

11. The computer-readable memory of claim 9, further characterized by: 

at least a subset of said database tables each including a plurality of partitions, each partition storing records 
having primary key values in a primary key range distinct from all others of said plurality of partitions; each 
partition stored as a separate data structure in memory of said computer system; 

said restructuring procedure comprising a move partition procedure for moving any specified one of said par- 
titions for one of said database tables having a plurality of partitions from a first memory location to a second 
memory location; 

said first phase instructions comprising instructions for creating a new partition at said second memory location 
and for copying all records of said specified partition into said new partition, using read only access to said 
records of said specified partition while performing said copying; 

said second phase instructions update said records in said new partition so as to redo each database table 
record event denoted by said accessed audit trail records; and 

said third phase instructions comprising instructions, for execution after said second phase instructions, for 
obtaining a lock on said one database table, then accessing audit records in said audit trail created by said 
transaction manager after execution of said second phase instructions; said third phase instructions updating 
said records in said new partition so as to redo each database table record event denoted by said accessed 
audit trail records. 
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PatentansprQche 

1. Computersystem zum Speichern von und Bereitstellen eines Nutzerzugriffs auf Daten in gespeicherten Daten- 
bankobjeKten, umTassend: 

einen Speicher (116) zum Speichern der Datenbankobjekte (120, 200), wobei die Datenbankobjekte Daten- 
banktabellen, Datenbankpartitionen und Datenbankindizes umfassen, wobei der Speicher in einer Mehrzahl 
von miteinander verbundenen Computerknoten (102) liegt; 
eine zentrale Verarbeitungseinheit (110); 

einen Transaktionsmanager (134), welcher mit dem Speicher gekoppelt ist, urn Computer-Transaktionen zu 
verwalten, die in den Datenbankobjekten gespeicherte Daten hinzufugen, loschen und verandern; wobei der 
Transaktionsmanager (134) Prufpfad-Erzeugungsanweisungen umfasst, urn einen Prufpfad (138) zu erzeu- 
gen und urn den Prufpfad in dem Speicher zu speichern, wobei der Prufpfad Prufdatensatze umfasst, wobei 
von wenigstens einem Untersatz der Prufpfad-Datensatze jeder ein Ereignis bezeichnet, das aus dem Satz 
ausgewahlt ist, der im Wesentlichen aus Hinzufugung, Loschung und Veranderung von spezifizierten Daten 
in einem spezifizierten der Datenbankobjekte besteht; 

eine Restrukturierungsprozedur, welche durch die zentrale Verarbeitungseinheit ausfuhrbar ist, urn ein spe- 
zifiziertes der Datenbankobjekte zu restrukturieren; 

dadurch gekennzeichnet, dass die Restrukturierungsprozedur umfasst: 

Erste-Phase-Anweisungen, urn auf Daten in einem ersten der Objekte zuzugreifen und urn entsprechende 
Daten in einem zweiten der Objekte zu speichern, wahrend eine fortgesetzte Durchfiihrung von Computer- 
Transaktionen gegen das erste Objekt gestattet wird; 

Zweite-Phase-Anweisungen, zur Ausfuhrung nach den Erste-Phase-Anweisungen, urn auf die Prufdatensatze 
in dem Prufpfad zuzugreifen, welche durch den Transaktionsmanager wahrend einer Ausfuhrung der Erste- 
Phase-Anweisungen und danach, bis eine Ausfuhrung der Zweite-Phase-Anweisungen abgeschlossen ist, 
erzeugt werden; wobei die Zweite-Phase-Anweisungen die in dem zweiten Objekt gespeicherten Daten ak- 
tualisieren, indem sie jedes Ereignis, das durch die Prufpfad-Datensatze bezeichnet wird, auf die zugegriffen 
wird, bezuglich des zweiten Objekts wiederholen; sowie 

Dritte-Phase-Anweisungen, zur Ausfuhrung nach den Zweite-Phase-Anweisungen, urn eine Sperre des ersten 
Objekts zu erhalten, urn eine fortgesetzte Durchfiihrung von Computer-Transaktionen gegen das erste Objekt 
zu verhindern, sowie urn dann auf Prufdatensatze in dem Prufpfad zuzugreifen, welche durch den Transakti- 
onsmanager nach einer Ausfuhrung der Zweite-Phase-Anweisungen erzeugt werden; wobei die Dritte-Phase- 
Anweisungen die Daten in dem zweiten Objekt aktualisieren, indem sie jedes Ereignis, das durch die Prufpfad- 
Datensatze bezeichnet wird, auf die zugegriffen wird, bezuglich des zweiten Objekts wiederholen. 

2. Computersystem des Anspruchs 1 , ferner dadurch gekennzeichnet: 

dass wenigstens eine der Datenbanktabellen eine Mehrzahl von Spalten aufweist, wobei eine Mehrzahl von 
Datensatzen Primarschlusselwerte auf Grundlage eines ersten spezifizierten Untersatzes der Spalten sowie 
einen Primarschlusselindex auf Grundlage der Primarschlusselwerte aufweist; 

dass die Restrukturierungsprozedur eine Index-erzeugen-Prozedur umfasst, urn einen Alternativschlusselin- 
dex mit einem eindeutigen Alternativschlusselwert fur jeden Datensatz zu erzeugen; 
dass das erste Objekt eine der Datenbanktabellen umfasst und das zweite Objekt einen durch die Index- 
erzeugen-Prozedur erzeugten Alternativschlusselindex umfasst; 

dass die Erste-Phase-Anweisungen Anweisungen umfassen, um auf die Datensatze in der einen Datenbank- 
tabelle zuzugreifen und Anweisungen umfassen, um einen entsprechenden Indexdatensatz in dem Alterna- 
tivschlusselindex zu erzeugen und zu speichern, wobei jeder Datensatz in dem Alternativschlusselindex den 
Primarschlusselwertfurden entsprechenden Datenbanktabellen-Datensatz und einen Alternativschlusselwert 
auf Grundlage eines zweiten spezifizierten Untersatzes der Spalten umfasst; 

dass die Zweite-Phase- und Dritte-Phase-Anweisungen Anweisungen umfassen, um einen Indexdatensatz 
in den Altemativschlusselindexdaten dann zu loschen, wenn einer der Prufdatensatze, auf die zugegriffen 
wird, ein Ereignis anzeigt, das den Alternativschlusselwert-Datensatz ungultig macht, und Anweisungen um- 
fassen, um Indexdatensatze zu dem Alternativschlusselindex dann hinzufugen, wenn die Prufdatensatze, auf 
die zugegriffen wird, Ereignisse anzeigen, die neue Datenbank-Datensatze hinzufugen, und Ereignisse an- 
zeigen, die irgendein Feld eines bestehenden Datenbank-Datensatzes andern, entsprechend irgendeiner 
Spalte in dem ersten und dem zweiten spezifizierten Untersatz von Spalten; und 
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dass die Erzeugungs- und Speicherungsanweisungen in den Erste-Phase-Anweisungen und die Indexdaten- 
satze-hinzufiigen-Anweisungen in den Zweite-Phase- und den Dritte-Phase-Anweisungen Anweisungen um- 
fassen, um zu bestimmen, ob zwei Datensatze in der einen Datenbank identische Alternativschlusselindex- 
wene auTweisen, und um die Index-erzeugen-Prozedur dann abzubrechen, wenn zwei derartige Datensatze 
gefunden werden. 

Computersystem des Anspruchs 1 , ferner gekennzeichnet durch: 

wenigstens einen Untersatz der Datenbanktabellen, wobei jede eine Mehrzahl von Partitionen umfasst und 
wobei jede Partition Datensatze mit Primarschlusselwerten in einem Primarschlusselbereich, welcher von alien 
anderen aus der Mehrzahl von Partitionen verschieden ist, speichert; wobei jede Partition in dem Speicher als 
eine gesonderte Datenstruktur gespeichert wind; 
und ferner dadurch gekennzeichnet: 

dass die Restrukiurierungsprozedur eine Partition-bewegen-Prozedur umfasst, um irgendeine spezifizierte 
der Partitionen fur eine der Datenbanktabellen, welche eine Mehrzahl von Partitionen aufweist, von einem 
ersten Speicherort zu einem zweiten Speicherort zu bewegen; 

dass die Erste-Phase-Anweisungen Anweisungen umfassen, um eine neue Partition bei dem zweiten Spei- 
cherort zu erzeugen und um alle Datensatze der spezifizierten Partition in die neue Partition zu kopieren , unter 
Verwendung von Nur-Lesezugriff auf die Datensatze der spezifizierten Partition, wahrend das Kopieren durch- 
gefuhrt wird; 

dass die Zweite-Phase- Anweisungen die Datensatze in der neuen Partition derart aktualisieren , dass sie jedes 
Datenbanktabellen-Datensatzereignis wiederholen, welches durch die Prufpfad-Datensatze bezeichnet wird, 
auf die zugegriffen wird; und 

dass die Dritte-Phase-Anweisungen Anweisungen umfassen, zur Ausfuhrung nach den Zweite-Phase-Anwei- 
sungen, um eine Sperre der einen Datenbanktabellezu erhalten, um dann auf Pruf datensatze in dem Prufpfad 
zuzugreifen, welche durch den Transaktionsmanager nach Ausfuhrung der Zweite-Phase-Anweisungen er- 
zeugt werden; wobei die Dritte-Phase-Anweisungen die Datensatze in der neuen Partition derart aktualisieren, 
dass sie jedes Datenbanktabellen-Datensatzereignis wiederholen, welches durch die Prufpfad-Datensatze 
bezeichnet wird, auf die zugegriffen wird. 

Computersystem des Anspruchs 3, ferner dadurch gekennzeichnet: 

dass die eine Datenbanktabelle ein zugeordnetes Schema aufweist, welches Informationen uber die Spalten 
der einen Datenbanktabelle bezeichnet; und 

dass die Dritte-Phase-Anweisungen Anweisungen umfassen, um die Schemainformationen fur die eine Da- 
tenbanktabelle derart zu aktualisieren, dass sie den Speicherort der neuen Partition anzeigen, und um dann 
die Sperre der einen Datenbanktabelle zu losen. 

Verfahren zum Speichern von und Bereitstellen von Nutzerzugriff auf Daten in gespeicherten Datenbankobjekten, 
welches die folgenden Schritte umfasst: 

Speichern der Datenbankobjekte (120, 200) im Speicher (116), wobei der Speicher in einer Mehrzahl von 
miteinander verbundenen Computerknoten (102) liegt, und wobei die Datenbankobjekte Datenbanktabellen, 
Partitionen und indizes umfassen; 

Erzeugen eines Priifpfads und Speichern des Prtifpfads in dem Speicher, wahrend Computer-Transaktionen 
durchgefuhrt werden, welche in den Datenbankobjekten gespeicherte Daten hinzufugen, loschen und veran- 
dern, wobei der Prufpfad Prufdatensatze umfasst, wobei von wenigstens einem Untersatz der Prufdatensatze 
jeder ein Ereignis bezeichnet, das ausgewahlt ist aus dem Satz, der im Wesentlichen aus Hinzufugung, L6- 
schung und Veranderung von spezifizierten Daten in einem spezifizierten der Datenbankobjekte besteht; und 
Restrukturieren eines spezifizierten der Datenbankobjekte, 

dadurch gekennzeichnet, dass der Restrukturierungsschritt umfasst: 

in einer ersten Phase Zugreifen auf Daten in einem ersten der Objekte und Speichern entsprechender Daten 
in einem zweiten der Objekte, wahrend eine fortgesetzte Durchfuhrung von Computer-Transaktionen gegen 
das erste Objekt gestattet wird; 

in einer zweiten Phase Zugreifen auf die Prufdatensatze in dem Prufpfad, welche wahrend der ersten Phase 
und danach , bis eine Durchfuhrung der zweiten Phase abgeschlossen ist, erzeugt werden, und Aktualisieren 
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der in dem zweiten Objekt gespeicherten Daten, indem jedes Ereignis, das durch die Prufpfad-Datensatze 
bezeichnet wird, auf die zugegriffen wird, bezuglich des zweiten Objekts wiederholt wird; sowie 
in einer dritten Phase Erhalten einer Sperre des ersten Objekts, um eine fortgesetzte Durchfuhrung von Com- 
puter-Transaktionen gegen das erste Objekt zu verhindern, und dann Zugreifen auf Prufdatensatze in dem 
Prufpfad, welche nach der zweiten Phase erzeugt werden, und Aktualisieren der Daten in dem zweiten Objekt, 
indem jedes Ereignis, das durch die Prufpfad-Datensatze bezeichnet wird, auf die zugegriffen wird, bezuglich 
des zweiten Objekts wiederholt wird. 

Verfahren des Anspruchs 5, femer dadurch gekennzeichnet: 

dass wenigstens eine der Datenbanktabeilen eine Mehrzahl von Spaften aufweist, wobei eine Mehrzahl von 
Datensatzen Primarschlusselwerte auf Grundiage eines ersten spezifizierten Untersatzes der Spatten sowie 
einen Primarschlusselindex auf Grundiage der Primarschlusselwerte aufweist; 

dass der Restrukturierungsschritt einen Schritt zum Erzeugen eines Altemativschlusselindex mit einem ein- 
deutigen Altemativschlusselwert furjeden Datensatz umfasst; 

dass das erste Objekt eine der Datenbanktabeilen umfasst und das zweite Objekt den durch den Index-er- 
zeugen-Schritt erzeugten Altemativschlusselindex umfasst; 
wobei der Restrukturierungsschritt umfasst: 

in der ersten Phase Zugreifen auf die Datensatze in der einen Datenbanktabelle und Erzeugen und Spei- 
chern eines entsprechenden Indexdatensatzes in dem Altemativschlusselindex, wobei jeder Datensatz 
in dem Altemativschlusselindex den Primarschlusselwert fur den entsprechenden Datenbanktabellen- 
Datensatz und einen Altemativschlusselwert auf Grundiage eines zweiten spezifizierten Untersatzes der 
Spalten umfasst; 

in der zweiten Phase und der dritten Phase Loschen eines Indexdatensatzes in dem Altemativschlusse- 
lindex dann, wenn einer der Prufdatensatze, auf die zugegriffen wird, ein Ereignis anzeigt, das den Alter- 
nativschlusselindex-Datensatz ungultig macht, und Hinzufugen von Indexdatensatzen zu dem Altemativ- 
schlusselindex dann, wenn die Prufdatensatze, auf die zugegriffen wird, Ereignisse anzeigen, die neue 
Datenbankdatensatze hinzufugen und Ereignisse anzeigen, die irgendein Feld eines bestehenden Da- 
te n ban kdatensatzes verandern, entsprechend irgendeiner Spalte in dem ersten und dem zweiten spezi- 
fizierten Untersatz von Spalten; sowie 

Bestimmen, ob zwei Datensatze in der einen Datenbanktabelle identische Alternativschliisselindex- 
werte aufweisen, und Abbrechen des Restrukturierungsschritts dann, wenn zwei derartige Datensatze 
gefunden werden. 

Verfahren des Anspruchs 5, femer gekennzeichnet durch: 

wenigstens einen Untersatz der Datenbanktabeilen, wobei jede eine Mehrzahl von Partitionen umfasst und 
wobei jede Partition Datensatze mit Primarschlusselwerten in einem Primarschlusselbereich, der von alien ande- 
ren aus der Mehrzahl von Partitionen verschieden ist, speichert; wobei jede Partition als eine gesonderte Daten- 
struktur in dem Speicher gespeichert wird; und ferner dadurch gekennzeichnet: 

dass der Restrukturierungsschritt ein Bewegen irgendeiner spezifizierten der Partitionen fur eine der Daten- 
banktabeilen, welche eine Mehrzahl von Partitionen aufweist, von einem ersten Speicherort zu einem zweiten 
Speicherort umfasst; wobei der Bewegungsschritt umfasst: 

in der ersten Phase Erzeugen einer neuen Partition bei dem zweiten Speicherort und Kopieren aller Daten- 
satze der spezifizierten Partition in die neue Partition, unter Verwendung von Nur-Lesezugriff auf die Daten- 
satze der spezifizierten Partition, wahrend das Kopieren durchgefuhrt wird; 

in der zweiten Phase Zugreifen auf Prufdatensatze in dem Prufpfad, welche wahrend einer Ausfiihrung der 
ersten Phase und danach, bis eine Durchfuhrung der zweiten Phase abgeschlossen ist, erzeugt werden, und 
Aktualisieren der Datensatze in der neuen Partition derart, dass jedes Datenbanktabellen-Datensatzereignis 
wiederholt wird, welches durch die Prufpfad-Datensatze bezeichnet wird, auf die zugegriffen wird; und 
in der dritten Phase Erhalten einer Sperre der einen Datenbanktabelle, dann Zugreifen auf die Prufdatensatze 
in dem Prufpfad, welche nach Vollendung der zweiten Phase erzeugt werden, und Aktualisieren der Daten- 
satze in der neuen Partition derart, dass jedes Datenbanktabellen-Datensatzereignis wiederholt wird, welches 
durch die Prufpfad-Datensatze bezeichnet wird, auf die zugegriffen wird; wobei die drftte Phase femer ein 
Losen der Sperre der einen Datenbanktabelle umfasst. 

Verfahren des Anspruchs 7, wobei jede Datenbanktabelle ein zugeordnetes Schema aufweist, welches Informa- 
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tionen uber die Spalten der Tabeile bezeichnet, 

wobei die dritte Phase des Bewegungsschritts ein Aktualisieren der Schemainformationen fur die eine Da- 
tenbanktabelle umfasst, urn den Speicherort der neuen Partition anzuzeigen. 

5 9. Computeriesbarer Speicher, welcher konfiguriert ist, urn ein Computersystem derart zu leiten, dass es Daten in 
gespeicherten Daten ban kobjekten speichert und Nutzerzugrilf darauf bereitstellt, und welcher umfasst: 

einen Transaktionsmanager (134), um Computer-Transaktionen zu verwalten, welche in den Datenbankob- 
jekten gespeicherte Daten hinzufugen, loschen und verandern; wobei die Datenbankobjekte (120, 200) Da- 

10 tenbanktabellen, Partitionen und indizes umfassen; wobei der Transaktionsmanager Prufpfad-Erzeugungs- 

anweisungen umfasst, um einen Prufpfad (138) zu erzeugen, wobei der Prufpfad Prufdatensatze umfasst, 
wobei von wenigstens einem Untersatz der Prufdatensatze jeder ein Ereignis bezeichnet, das aus dem Satz 
ausgewahlt ist, welcher im Wesentlichen aus Hinzufugung, Loschung und Veranderung von spezifizierten 
Daten in einem spezifizierten der Datenbankobjekte besteht; sowie 

15 eine Restrukturierungsprozedur, um ein spezifiziertes der Datenbankobjekte zu restrukturieren; 

dadurch gekennzeichnet, dass die Restrukturierungsprozedur umfasst: 

Erste-Phase-Anweisungen, um auf Daten in einem ersten der Objekte zuzugreifen und um entsprechende 
20 Daten in einem zweiten der Objekte zu speichern, wahrend eine fortgesetzte Durchfuhrung von Computer- 

Transaktionen gegen das erste Objekt gestattet wird; 

Zweite-Phase-Anweisungen, zur Ausfuhrung nach den Erste-Phase-Anweisungen, um auf die Prufdatensatze 
in dem Prufpfad zuzugreifen, welche durch den Transaktionsmanager wahrend einer Ausfuhrung der Erste- 
Phase-Anweisungen und danach, bis eine Ausfuhrung der Zweite-Phase-Anweisungen abgeschlossen ist, 
25 erzeugt werden; wobei die Zweite-Phase-Anweisungen die in dem zweiten Objekt gespeicherten Daten ak- 

tualisieren, indem sie jedes Ereignis, das durch die Prufpfad-Datensatze bezeichnet wird, auf die zugegriffen 
wird, bezuglich des zweiten Objekts wiederholen; sowie 

Dritte-Phase-Anweisungen, zur Ausfuhrung nach den Zweite-Phase-Anweisungen, um eine Sperre des ersten 
Objekts zu erhalten, um eine fortgesetzte Durchfuhrung von Computer-Transaktionen gegen das erste Objekt 
30 zu verhindern und um dann auf Prufdatensatze in dem Prufpfad zuzugreifen, welche durch den Transaktions- 

manager nach einer Ausfuhrung der Zweite-Phase-Anweisungen erzeugt werden; wobei die Dritte-Phase- 
Anweisungen die Daten in dem zweiten Objekt aktualisieren, indem sie jedes Ereignis, das durch die Prufpfad- 
Datensatze bezeichnet wird, auf die zugegriffen wird, bezuglich des zweiten Objekts wiederholen. 

35 10. Computeriesbarer Speicher des Anspruchs 9, ferner dadurch gekennzeichnet: 

dass wenigstens eine der Datenbanktabellen eine Mehrzahl von Spalten aufweist, wobei eine Mehrzahl von 
Datensatzen Primarschlusselwerte auf Grundlage eines ersten spezifizierten Untersatzes der Spalten sowie 
einen Primarschlusselindex auf Grundlage der Primarschlusselwerte aufweist; 
to dass die Restrukturierungsprozedur eine Index-erzeugen-Prozedur umfasst, um einen Altern at ivsch I usse (in- 

dex mit einem eindeutigen Altemativschlusselwertfur jeden Datensatz zu erzeugen; 
dass das erste Objekt eine der Datenbanktabellen umfasst und das zweite Objekt einen Alternativschlusse- 
lindex umfasst, welcher durch die Index-erzeugen-Prozedur erzeugt wird; 

dass die Erste-Phase-Anweisungen Anweisungen umfassen, um auf Datensatze in der einen Datenbankta- 
45 belle zuzugreifen und Anweisungen umfassen, um einen entsprechenden indexdatensatz in dem Alternativ- 

schlusselindex zu erzeugen und zu speichern, wobei jeder Datensatz in dem Alternativschlusselindex den 
Primarschlusselwertfur den entsprechenden Datenbanktabellen-Datensatz und einen Alternativschlusselwert 
auf Grundlage eines zweiten spezifizierten Untersatzes der Spalten umfasst; 

dass die Zweite-Phase- und Dritte-Phase-Anweisungen Anweisungen umfassen, um einen Indexdatensatz 
50 in dem Alternativschlusselindex dann zu loschen, wenn einer der Prufdatensatze, auf die zugegriffen wird, ein 

Ereignis anzeigt, das den Alternativschlusselindex-Datensatz ungultig macht, und Anweisungen umfassen, 
um Indexdatensatze zu dem Alternativschlusselindex dann hinzuzufugen, wenn die Prufdatensatze, auf die 
zugegriffen wird, Ereignisse anzeigen, die neue Datenbankdatensatze hinzufugen und Ereignisse anzeigen, 
die irgendein Feld eines bestehenden Datenbank-Datensatzes verandern, entsprechend irgendeiner Spalte 
55 in dem ersten und dem zweiten spezifizierten Untersatz von Spalten; und 

dass die Erzeugungs- und Speicherungsanweisungen in den Erste-Phase-Anweisungen und die Indexdaten- 
satze-hinzufugen-Anweisungen in den Zweite-Phase- und Dritte-Phase-Anweisungen Anweisungen umfas- 
sen, um zu bestimmen, ob zwei Datensatze in der einen Datenbanktabelle identische Alternativschlusselin- 



37 



EP0 723 238 B1 

dexwerte aufweisen und urn die Index-erzeugen-Prozedur dann abzubrechen, wenn zwei derartige Datensat- 
ze gefunden werden. 

11 . Computerlesbarer Speicher des Anspruchs 9, ferner 
5 gekennzeichnet durch: 

wenigstens einen Untersatz der Datenbanktabellen, wobei jede eine Mehrzahl von Partitionen umfasst und 
wobei jede Partition Datensatze mit Primarschlusselwerten in einem PrimarschlGsselwertbereich, der von alien 
anderen aus der Mehrzahl von Partitionen verschieden ist, speichert; wobei jede Partition als eine gesonderte 
Datenstruktur im Speicher des Computersystems gespeichert wird; und ferner dadurch gekennzeichnet: 

10 

dass die Restrukturierungsprozedur eine Partition-bewegen-Prozedur umfasst, urn irgendeine spezifizierte 
der Partitionen fur eine der Datenbanktabellen, welche eine Mehrzahl von Partitionen aufweist, von einem 
ersten Speicherort zu einem zweiten Speicherort zu bewegen; 

dass die Erste-Phase-Anweisungen Anweisungen umfassen, urn eine neue Partition bei dem zweiten Spei- 
15 cherort zu erzeugen und urn alle Datensatze der spezif izierten Partition in die neue Partition zu kopieren , unter 

Verwendung von Nur-Lesezugriff auf die Datensatze der spezif izierten Partition, wahrend das Kopieren durch- 
gefuhrt wird; 

dass dieZweite-Phase-Anweisungen die Datensatze in der neuen Partition derart aktualisieren, dass sie jedes 
Datenbanktabellen-Datensatzereignis wiederholen, welches durch die Prufpfad-Datensatze bezeichnet wird, 

20 auf die zugegriffen wird; und 

dass die Dritte-Phase- Anweisungen Anweisungen umfassen, zur Ausfuhrung nach den Zweite- Phase- Anwei- 
sungen, um eine Sperre der einen Datenbanktabelle zu erhalten, urn dann auf Pruf datensatze in dem Prufpfad 
zuzugreifen, welche durch den Transaktionsmanager nach Ausfuhrung der Zweite-Phase-Anweisungen er- 
zeugt werden; wobei die Dritte-Phase-Anweisungen die Datensatze in der neuen Partition derart aktualisieren, 

25 dass sie jedes Datenbanktabellen-Datensatzereignis wiederholen, welches durch die Prufpfad-Datensatze 

bezeichnet wird, auf die zugegriffen wird. 



R eve ndicat ions 

30 

1. Systeme informatique destine k memoriser et k donner, k un utilisateur, acces a des donnees dans des objets 
memorises de base de donnees, comprenant : 

une memoire (116) destinee a memoriser lesdits objets (1 20, 200) de base de donnees, lesdits objets de base 
35 de donnees comprenant des tables de base de donnees, des partitions de base de donnees et des index de 

base de donnees, ladite memoire rSsidant dans une plurality de noeuds informatiques interconnects (102) ; 
une unite centrale de traitement (110) ; 

un gestionnaire (1 34) de transactions, associe k ladite memoire, destine k gerer des transactions informatiques 
qui ajoutent, suppriment et modifient des donn6es m6moris6es dans lesdits objets de base de donnees ; ledit 

40 gestionnaire (1 34) de transactions incluant des instructions de production de journal temoin destine a engen- 

drer un journal temoin (138) et k memoriser ledit journal temoin dans ladite memoire, ledit journal temoin 
incluant des enregistrements temoins, chacun d'au moins un sous-ensemble desdits enregistrements temoins 
notant un 6v6nement choisi k partir de ['ensemble constitu6 essentiellement d'addition, de suppression et de 
modification de donnees sp6cifi6es dans Tun, sp6cifie, desdits objets de base de donnees ; 

45 une procedure de restructuration, executable par ladite unite centrale de traitement, destinee k restructurer 

Tun, sp6cifi6, desdits objets de base de donnees ; 

caracterise en ce que ladite procedure de restructuration comprend : 

50 des instructions d'une premiere phase destinies k acceder k des donn6es dans un premier desdits objets et 

k memoriser des donnees correspondantes dans un deuxieme desdits objets tout en autorisant la poursuite 
de I'execution de transactions informatiques sur ledit premier objet ; 

des instructions d'une deuxieme phase, pour execution aprfcs lesdites instructions de premiere phase, desti- 
nies k acceder auxdits enregistrements temoins dans ledit journal temoin cree par ledit gestionnaire de tran- 
55 sactions pendant I'execution desdites instructions de premiere phase et ensuite jusqu'6 ce que l*ex6cution 

desdites instructions de deuxieme phase soit terminee ; lesdites instructions de deuxieme phase mettant k 
jour lesdites donnees memoris6es dans ledit deuxieme objet en refaisant, en ce qui concerne ledit deuxieme 
objet, chaque ev6nement note par lesdits enregistrements de journal temoin auxquels on a accede ; et 
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des instructions d'une troisteme phase, pour execution apr&s lesdites instructions de deuxfeme phase, desti- 
nies k obtenir un verrouillage sur ledit premier objet de fagon a privenir la poursuite de Tex6cution de tran- 
sactions informatiques sur ledit premier objet et k acxteder alors aux enregistrements temoins dans ledit journal 
temom cree par ledit gestionnaire de transactions apres execution desdites instructions de deuxieme phase ; 
lesdites instructions de troisfeme phase mettant k jour lesdites donnies dans ledit deuxteme objet en refaisant, 
en ce qui concerne ledit deuxfeme objet, chaque 6v6nement note par lesdits enregistrements de journal temoin 
auxquels on a acc6d§. 

Systeme informatique selon la revendication 1, caracterisi en outre : 

en ce qu'au moins Tune desdites tables de base de donnies comporte plusieurs colonnes, plusieurs enregis- 
trements ayant des valeurs de cte primaire bashes sur un premier sous-ensemble sp6cif te desdites colonnes, 
et un index de cte primaire bas6 sur lesdites valeurs de cte primaire ; 

en ce que ladite procedure de restructuration comprend une procedure de creation d'index destinie k cteer 
un index de cle secondaire comportant une unique valeur de cte secondaire pour chaque enregistrement ; 
en ce que ledit premier objet comprend Tune desdites tables de base de donnies et en ce que ledit deuxfeme 
objet comprend un index de cte secondaire engendri par ladite procedure de creation d'index ; 
en ce que lesdites instructions de premiere phase comprennent des instructions pour aoteder auxdits enre- 
gistrements dans ladite une table de base de donnSes et des instructions pour engendrer et rrtemoriser un 
enregistrement d'index correspondant dans ledit index de cte secondaire, chaque enregistrement dans ledit 
index de cte secondaire incluant ladite valeur de cte primaire pour ledit enregistrement correspondant de table 
de base de donnees et une valeur de cte secondaire basee sur un deuxieme sous-ensemble specifie desdites 
colonnes ; 

en ce que lesdites instructions de deuxteme phase et de troisi&me phase comprennent des instructions pour 
supprimer un enregistrement d'index dans ledit index de cte secondaire lorsque i'un desdits enregistrements 
temoins auxquels on a acc6d6 indique un ^tenement rendant invalide ledit enregistrement de cte secondaire, 
et des instructions pour ajouter des enregistrements d'index audit index de cl6 secondaire lorsque lesdits 
enregistrements temoins auxquels on a acc6d6 indiquent des 6v6nements qui ajoutent de nouveaux enregis- 
trements de base de donnies et des ^tenements qui modifient une quelconque zone d'un enregistrement de 
base de donnees existent correspondant k une quelconque colonne dans lesdits premier et deuxteme sous- 
ensembles specifies de colonnes ; et 

en ce que lesdites instructions de production et de memorisation dans lesdites instructions de premiere phase, 
et lesdites instructions d'ajout d'enregistrements d'index dans lesdites instructions de deuxieme phase et de 
troisieme phase comprennent des instructions pour determiner si deux enregistrements dans ladite table de 
base de donnies ont des valeurs identiques d'index de cte secondaire et pourfaire avorter ladite procedure 
de creation d'index lorsque Ton trouve deux de ces enregistrements. 

Systeme informatique selon la revendication 1, caracterisi en outre : 

par au moins un sous-ensemble desdites tables de base de donnies comprenant chacun une plurality de 
partitions, chaque partition rrtemorisant des enregistrements ayant des valeurs de cle primaire dans une plage 
de des primaires distincte de toutes les autres de ladite pluralite de partitions ; chaque partition etant memo- 
rise comme une structure de donnees distincte dans ladite rrtemoire ; 

en ce que ladite procedure de restructuration comprend une procedure de diplacement de partition, d'un 
premier emplacement de rrtemoire k un second emplacement de rrtemoire, destinie a diplacer I'une quel- 
conque, spiciftee, desdites partitions pour I'une desdites tables de base de donnees comportant une pluralite 
de partitions ; 

en ceque lesdites instructions de premiere phase comprennent des instructions destinies Verier une nouvelle 
partition au niveau dudit second emplacement de rrtemoire et k copier tous les enregistrements de ladite 
partition spiciftee dans ladite nouvelle partition, en utilisant un acc&s a lecture seule auxdits enregistrements 
de ladite partition spiciftee tout en effectuant ladite copie ; 

en ce que lesdites instructions de deuxteme phase mettent a jour lesdits enregistrements dans ladite nouvelle 
partition de facon a refaire chaque 6v6nement d'enregistrement de table de base de donnies note par lesdits 
enregistrements de journal temoin auxquels on a accidi ; et 

en ce que lesdites instructions de troisteme phase comprennent des instructions, pour execution aprds lesdites 
instructions de deuxteme phase, pour obtenir un verrouillage de ladite une table de base de donnies, puis 
pour accider auxdits enregistrements dans ledit journal temoin cr66 par ledit gestionnaire de transactions 
apr&s execution desdites instructions de deuxteme phase ; lesdites instructions de troisteme phase mettant 
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k jour lesdits enregistrements dans ladite nouvelle partition de fagon k refaire chaque 6v6nement d'enregis- 
trement de table de base de donn6es note par lesdits enregistrements de journal temoin auxquels on a acc6d6. 

Syst&me informatique selon la revendication 3, caracteris§ en outre : 

en ce que ladite une table de base de donnSes possfede un schema associS notant de reformation au sujet 
des colonnes de ladite une table de base de donnees ; et 

en ce que lesdites instructions de la troisi&me phase comprennent des instructions pour mettre k jour ladite 
information de schema pour ladite une table de base de donn6es pour indiquer ledit emplacement de rrtemoire 
de ladite nouvelle partition, et pour Iib6rer alors ledit verrouiiiage de ladite une table de base de donn§es. 

Proc6d6 destine k nrtemoriser et k donner, k un utilisateur, accfcs k des donnSes dans des objets m6moris6s de 
base de donnSes, comprenant les Stapes consistant : 

k memoriser lesdits objets (120, 200) de base de donnees, dans une m6moire (116), ladite memoire rdsidant 
dans une plurality de noeuds informatiques interconnects (102) et lesdits objets de base de donn6es com- 
prenant des tables, des partitions et des index de base de donnees ; 

tout en effectuant des transactions informatiques qui ajoutent, suppriment et modifient des donnees nrtemori- 
s6es dans lesdits objets de base donnees, k engendrer un journal temoin et k memoriser ledit journal temoin 
dans ladite nrtemoire, ledit journal temoin incluant des enregistrements temoins, chacun d'au moins un sous- 
ensemble desdits enregistrements temoins notant un 6v6nement choisi k partir de I'ensemble constitu6 es- 
sentiellement d'addition, de suppression et de modification de donnees spSciftees dans I'un, sp6cifte, desdits 
objets de base de donnees ; 

k restructurer Tun, sp6cifi6, desdits objets de base de donn6es ; 

caracterlse en ce que ladite 6tape de restructuration comprend : 

dans une premiere phase, fe fait d'accSder k des donnees dans un premier desdits objets et de memoriser 
des donnees correspondantes dans un deuxfeme desdits objets tout en autorisant la poursuite de l'ex6cution 
de transactions informatiques sur ledit premier objet ; 

dans une deuxieme phase, le fait d'acc6der auxdits enregistrements temoins dans ledit journal temoin cree 
pendant ladite premiere phase et ensuite jusqu'& ce que I'exgcution de ladite deuxieme phase soit terminee, 
et le fait de mettre k jour lesdites donnees memorises dans ledit deuxfeme objet en refaisant, en ce qui 
concerne ledit deuxi&me objet, chaque §v6nement note par lesdits enregistrements de journal temoin auxquels 
on a acc£d£ ; et 

dans une troisieme phase, le fait d'obtenirun verrouiiiage sur ledit premier objet defa$on &pr§venir la poursuite 
de Tex6cution de transactions informatiques sur ledit premier objet et d'acxteder alors aux enregistrements 
temoins dans ledit journal temoin engendrS aprfes ladite deuxieme phase, et le fait de mettre k jour lesdites 
donn§es dans ledit deuxifeme objet en refaisant, en ce qui concerne ledit deuxieme objet, chaque 6v6nement 
note par lesdits enregistrements de journal temoin auxquels on a acc&te. 

Proc6de selon la revendication 5, caracterisS en outre : 

en ce qu'au moins Tune desdites tables de base de donnees comporte plusieurs colonnes, plusieurs enregis- 
trements ayant des valeurs de cl6 primaire basees sur un premier sous-ensemble specific desdites colonnes, 
et un index de c!6 primaire bas6 sur lesdites valeurs de cl6 primaire ; 

en ce que ladite 6tape de restructuration comprend une 6tape destin6e k cr6er un index de cl§ secondaire 
comportant une unique vaieur de cl6 secondaire pour chaque enregistrement ; 

en ce que ledit premier objet comprend Tune desdites tables de base de donn6es et en ce que ledit deuxieme 
objet comprend ledit index de cl6 secondaire engendte par ladite 6tape de creation d'index ; 
en ce que ladite 6tape de restructuration comprend : 

dans ladite premiere phase, le fait d'accdder auxdits enregistrements dans ladite une table de base de 
donnees et d'engendrer et de nrtemoriser un enregistrement d'index correspondant dans ledit index de 
cl6 secondaire, chaque enregistrement dans ledit index de cl£ secondaire incluant ladite vaieur de cl£ 
primaire pour ledit enregistrement correspondant de table de base de donnees et une vaieur de cl6 se- 
condaire bas6e sur un deuxidme sous-ensemble sp6cifi6 desdites colonnes ; 

dans ladite deuxieme phase et dans ladite troisifcme phase, le fait de supprtmer un enregistrement d'index 
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dans ledit index de c!6 secondaire lorsque Fun desdits enregistrements temoins auxquels on a acc6d6 
indique un 6v6nement rendant invalide ledit enregistrement de c!6 secondaire, et le fait d'ajouter des 
cnrcglctrcments d'lndex audit index He QPrnnriairp lorsque lesdits enregistrements temoins auxquels 
on a accede indiquent des 6venements qui ajoutent de nouveaux enregistrements de base de donnees 
et des tenements qui modifient une quetconque zone d'un enregistrement de base de donnees existant 
correspondant k une quelconque colonne dans lesdits premier et deuxieme sous-ensembles specifies de 
colonnes ; et 

le fait de determiner si deux enregistrements dans tadite table de base de donnees ont des valeurs iden- 
tiques d'index de cl6 secondaire et de faire avorter ladite 6tape de restructuration lorsque Ton trouve deux 
de ces enregistrements. 

Procede selon la revendication 5, caracterise en outre : 

par au moins un sous-ensemble desdites tables de base de donnees comprenant chacun une plurality de 
partitions, chaque partition m&norisant des enregistrements ayant des valeurs de c\6 primaire dans une plage 
de cles primaires distincte de toutes les autres de ladite plurality de partitions ; chaque partition 6tant m6mo- 
ris§e comme une structure de donnees distincte dans ladite memoire ; 

en ce que ladite §tape de restructuration comprend le fait de deplacer, d'un premier emplacement de m6moire 
k un second emplacement de memoire, Tune quelconque, sp6cifi6e, desdites partitions pour Tune desdites 
tables de base de donnees comportant une plurality de partitions ; ladite 6tape de d6placement incluant : 

dans ladite premiere phase, le fait de creer une nouvelle partition au niveau dudit second emplacement 
de memoire et de copier tous les enregistrements de ladite partition specif i6e dans ladite nouvelle partition, 
en utilisant un accfes k lecture seule auxdits enregistrements de ladite partition sp6cif tee tout en effectuant 
ladite copie ; 

dans ladite deuxieme phase, le fait d'accdder auxdits enregistrements temoins dans ledit journal temoin 
cre6 pendant l'ex§cution de ladite premiere phase et ensuite jusqu'& ce que I'execution de ladite deuxi&me 
phase soit termin6e, et le fait de mettre k jour lesdits enregistrements dans ladite nouvelle partition de 
fa?on k refaire chaque evenement d'enregistrement de table de base de donnees note par lesdits enre- 
gistrements de journal temoin auxquels on a accede ; et 

dans ladite troisfeme phase, le fait d'obtenir un verrouillage de ladite une table de base de donnees, puis 
d'acc6der auxdits enregistrements dans ledit journal temoin cre6 apres I'acrtevement de ladite deuxieme 
phase ; et le fait de mettre k jour lesdits enregistrements dans ladite nouvelle partition de fagon k refaire 
chaque evenement d'enregistrement de table de base de donn§es note par lesdits enregistrements de 
journal temoin auxquels on a acc§d6 ; ladite troisieme phase comprenant en outre la liberation dudit ver- 
rouillage de ladite une table de base de donn6es. 

Proc6d6 selon la revendication 7, dans lequel chaque table de base de donnees poss&de un schema assocte 
notant de I'information au sujet des colonnes de ladite table ; et 

ladite troisi&me phase de ladite etape de d6placement comprenant le fait de mettre k jour ladite information 
de schema pour ladite une table de base de donn6es pour indiquer ledit emplacement de memoire de ladite 
nouvelle partition. 

Memoire lisible en ordinateur configure pour diriger un systeme informatique pour memoriser et donner, k un 
utilisateur, acc&s k des donnees dans des objets rrtemoris6s de base de donnees, et comprenant : 

un gestionnaire (1 34) de transactions destine k g§rer des transactions informatiques qui ajoutent, suppriment 
et modifient des donnees rrtemorisees dans lesdits objets de base donnees ; lesdits objets (1 20, 200) de base 
de donnees comprenant des tables, des partitions et des index de base de donnees ; ledit gestionnaire de 
transactions incluant des instructions de production de journal temoin destinees k engendrer un journal temoin 
(1 38), ledit journal temoin incluant des enregistrements temoins, chacun d'au moins un sous-ensemble desdits 
enregistrements temoins notant un evenement choisi k partir de I'ensemble constitue essentiellement d'addi- 
tion, de suppression et de modification de donnees sp^cifiees dans Tun, sp^cifie, desdits objets de base de 
donn§es ; 

une procedure de restructuration, destinee k restructurer Tun, sp£cifie, desdits objets de base de donnees ; 
caracterise en ce que ladite procedure de restructuration comprend : 
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des instructions d'une premiere phase destinies k acceder k des donnees dans un premier desdits objets et 
k memoriser des donnees correspondantes dans un deuxieme desdits objets tout en autorisant la poursuite 
de ('execution de transactions informatiques sur ledit premier objet : 

des instructions d'une deuxieme phase, pour execution aprds lesdites instructions de premiere phase, desti- 
nees k acceder auxdits enregistrements temoins dans ledit journal temoin cree par ledit gestionnaire de tran- 
sactions pendant TexScution desdites instructions de premiere phase et ensuite jusqu'& ce que l'ex6cution 
desdites instructions de deuxieme phase soit terminee ; lesdites instructions de deuxieme phase mettant k 
jour lesdites donnees nrtemorisees dans ledit deuxieme objet en refaisant, en ce qui concerne ledit deuxieme 
objet, chaque ev6nement note par lesdits enregistrements de journal temoin auxquels on a accede ; et 
des instructions d'une troisieme phase, pour execution apres lesdites instructions de deuxieme phase, desti- 
nees k obtenir un verrouillage sur ledit premier objet de fa9on k prevenir la poursuite de I'execution de tran- 
sactions informatiques sur ledit premier objet et k acceder alors aux enregistrements temoins dans ledit journal 
temoin cr66 par ledit gestionnaire de transactions apres execution desdites instructions de deuxieme phase ; 
lesdites instructions de troisieme phase mettant k jour lesdites donnees dans ledit deuxieme objet en refaisant, 
en ce qui concerne ledit deuxieme objet, chaque 6venement note par lesdits enregistrements de journal temoin 
auxquels on a accede. 

10. Memoire lisible en ordinateur selon ia revendication 9, caracterisee en outre : 

en ce qu'au moins I'une desdites tables de base de donnees comporte plusieurs colonnes, plusieurs enregis- 
trements ayant des valeurs de cie primaire bashes sur un premier sous-ensemble specifie desdites colonnes, 
et un index de cle primaire base sur lesdites valeurs de cl6 primaire ; 

en ce que ladite procedure de restructuration comprend une procedure de creation d'index destinee k creer 
un index de cie secondaire comportant une unique valeur de cie secondaire pour chaque enregistrement ; 
en ce que ledit premier objet comprend I'une desdites tables de base de donnees et en ce que ledit deuxieme 
objet comprend un index de cie secondaire engendre par ladite procedure de creation d'index ; 
en ce que lesdites instructions de premiere phase comprennent des instructions pour acceder auxdits enre- 
gistrements dans ladite une table de base de donnees et des instructions pour engendrer et memoriser un 
enregistrement d'index correspondant dans ledit index de cle secondaire, chaque enregistrement dans ledit 
index de cle secondaire incluant ladite valeur de cie primaire pour ledit enregistrement correspondant de table 
de base de donnees et une valeur de cl6 secondaire bas6e sur un deuxieme sous-ensemble specifie desdites 
colonnes ; 

en ce que lesdites instructions de deuxieme phase et de troisieme phase comprennent des instructions pour 
supprimer un enregistrement d'index dans ledit index de cie secondaire lorsque I'un desdits enregistrements 
temoins auxquels on a accede indique un 6v6nement rendant invalide ledit enregistrement de cie secondaire, 
et des instructions pour ajouter des enregistrements d'index audit index de cle secondaire lorsque lesdits 
enregistrements temoins auxquels on a accede indiquent des evSnements qui ajoutent de nouveaux enregis- 
trements de base de donnees et des evenements qui modifient une quelconque zone d'un enregistrement de 
base de donn6es existant correspondant k une quelconque colonne dans lesdits premier et deuxieme sous- 
ensembles specifics de colonnes ; et 

en ce que lesdites instructions de production et de memorisation dans lesdites instructions de premiere phase, 
et lesdites instructions d'ajout d'enregistrements d'index dans lesdites instructions de deuxieme phase et de 
troisieme phase comprennent des instructions pour determiner si deux enregistrements dans ladite table de 
base de donn6es ont des valeurs identiques d'index de cie secondaire et pourfaire avorter ladite procedure 
de creation d'index lorsque Ton trouve deux de ces enregistrements. 

11. M6moire lisible en ordinateur selon la revendication 9, caracteris6e en outre : 

par au moins un sous-ensemble desdites tables de base de donnees comprenant chacun une plurality de 
partitions, chaque partition memorisant des enregistrements ayant des valeurs de cl6 primaire dans une plage 
de c!6s primaires distincte de toutes les autres de ladite plurality de partitions ; chaque partition etant memo- 
risee comme une structure de donnees distincte dans la memoire dudit systeme informatique ; 
en ce que ladite procedure de restructuration comprend une procedure de deplacement de partition, d'un 
premier emplacement de memoire k un second emplacement de memoire, destinee k deplacer I'une quel- 
conque, specifiee, desdites partitions pour I'une desdites tables de base de donnees comportant une pluralite 
de partitions ; 

en ce que lesdites instructions de premiere phase comprennent des instructions destin6es k creer une nouvelle 
partition au niveau dudit second emplacement de memoire et k copier tous les enregistrements de ladite 
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partition speciftee dans ladite nouvelle partition, en utiiisant un accfcs k lecture seule auxdits enregistrements 
de ladite partition sp6cifi6e tout en effectuant ladite copie ; 

en que lesdites instructions dc dc«x;crr.e phase mcttcnt a jeurJesdits enregistrements d?T>s inriitp nnnvpiip 
partition de facon k refaire chaque 6v6nement d'enregistrement de table de base de donn6es note par lesdits 

5 enregistrements de journal temoin auxquels on a acc6d§ ; et 

en ce que lesdites instructions de troisfeme phase comprennent des instructions, pour execution aprds lesdites 
instructions de deuxfeme phase, pour obtenir un verrouillage de ladite une table de base de donn£es, puis 
pour acc6der auxdits enregistrements dans ledit journal temoin cr66 par ledit gestionnaire de transactions 
aprfes execution desdites instructions de deuxteme phase ; lesdites instructions de troisifcme phase mettant 

10 & jour lesdits enregistrements dans ladite nouvelle partition de facon k refaire chaque 6v6nement d'enregis- 

trement de table de base de donn6es note par lesdits enregistrements de journal temoin auxquels on a acc6d6. 
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Command: MOvePariiUon(rariitioniD, NewDiskLccatic.n) 



1 



^220 



Phase 1: "Dirty Copy" 



Create file for NewPartition. 

Create file label for NewPartition. 

Save position of last record in Audit Trait as AuditPtr. 

While allowing user transactions to continue accessing the 
specified OldPartition, copy all records of the OldPartition 
into the NewPartition without regard to the fact that records 
in the OldPartition may be in the process of being changed. 



r 222 



Phase 2: "Partial Cleanup" 



Set Filter for AuditTrail to access only audit trail records 
that pertain to the OldPartition. 

Starting with the audit record in the Audit Trail at the initial 
AuditPtr value, repeat the following steps for each 
sequential audit record that passes the Filter until the end 
of the Audit Trail is reached: 

Modify the audit record to refer to the NewPartition. 
If the change to the table noted in the audit record is 
already reflected in the copied records. 

Ignore the audit record 
Otherwise 

Perform a "redo" with respect to the modified audit 
record so as to apply the change noted in the audit 
record to the NewPartition. This includes updating 
the B-tree for the NewPartition as needed to 
perform the change. 



FIGURE 4A 



FIGURE 4A 
FIGURE 4B 
FIGURE 4 
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Phase 3: "Final Cleanup" 

Request and wait for lock on the OldPartition. 
As a unitary transaction: 
Repeat the following steps for each sequential audit 
record that passes the Filter until the last such audit 
record has been processed: 
Modify the audit record to refer to the NewPartition. 
If the change to the table noted in the audit record is 
already reflected in the copied records. 

Ignore the audit record 
Otherwise 

Perform a "redo" with respect to the modified audit 
record so as to apply the change noted in the audit 
record to the NewPartition. This includes updating 
the B-tree for the NewPartition as needed to 
perform the change. 

Request and wait for lock on the Base Table. 

Delete Catalog entry for OldPartition 

Create Catalog entry for NewPartition 

Update catalog entries and file labels for all other 

partitions of the Base Table. 

Release Lock on BaseTable. 

In Background, delete OldPartition and release disk 
space used by OldPartition file 

Conclude transaction. 



FIGURE 4B 
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Command: bpiiiPariiiion(Oiurartiiion, Cid Range, 
NewRangel, NewRange2, NewPartition) 

1 d*° 

Phase 1: "Dirty Copy" 

Create file for NewPartition. 

Create file label for NewPartition. 

Save position of last record in Audit Trail as AuditPtr. 

While allowing user transactions to continue accessing the 
specified OldPartition, copy all records of the OldPartition 
having a Primary Key value in NewRange2 into the 
NewPartition without regard to the fact that records in the 
OldPartition may be in the process of being changed. 



I ^232 

Phase 2: "Partial Cleanup" 

Set Filter for AuditTrail to access only audit trail records 
that pertain to OldPartition records with a Primary key 
value in NewRange2. 

Starting with the audit record in the Audit Trail at the initial 
AuditPtr value, repeat the following steps for each 
sequential audit record that passes the Filter until the end 
of the Audit Trail is reached: 

Modify the audit record to refer to the NewPartition. 
If the change to the table noted in the audit record is 
already reflected in the copied records. 

Ignore the modified audit record 
Otherwise 

Perform a "redo" with respect to the modified 
audit record so as to apply the change noted in 
the audit record to the NewPartition. This 
includes updating the B-tree for the 
NewPartition as needed to perform the change. 




FIGURE 5A 
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Phase 3: "Final Cleanup" 

Request and wait tor lock on the OldPartition. 

As a unitary transaction: 
Read each audit record in the Audit Trail that references 
the OldPartition and repeat the following steps for each 
sequential audit record that passes the Filter until the 
last such audit record has been processed: 

Modify the audit record to refer to the NewPartition. 
If the change to the table noted in the audit record 
is already reflected in the copied records. 

Ignore the audit record 
Otherwise 

Perform a "redo" with respect to the modified 
audit record so as to apply the change noted in 
the audit record to the NewPartition. This 
includes updating the B-tree for the 
NewPartition as needed to perform the change. 

Split B-tree for OldPartition into two parts, one for 
records In NewRangel and one for records in 
NewRange2. 

Make NewRange2 in OldPartion inaccessible. 
Request and wait for lock on the Base Table. 
Delete Catalog entry for OldPartition 
Create Catalog entry for NewPartition 
Update catalog entries and file labels for all other 
partitions of the Base Table. 
Release Lock on BaseTable. 
Conclude transaction. 

In Background, delete all records in OldPartition 
referenced by B-tree for records in NewRange2 



FIGURE 5B 



FIGURE 5A 
FIGURE SB 
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Command: MovePartitionBoundary(Partition1, OldRangel, 
NewRangel, Partition2, OldRange2, NewRange2) 

1 ^£40 

Phase 1: "Dirty Copy" 

Save position of last record in Audit Trail as AuditPtr. 
MoveRange = range of Primary Key values in NewRange2 
that are not in OldRange2. 

While allowing user transactions to continue accessing the 
specified Partitionl, copy all records of Partitionl having a 
Primary Key value in the MoveRange into Partition2 without 
regard to the fact that records in Partitionl may be in the 
process of being changed. 

i ^242 

Phase 2: "Partial Cleanup" 

Set Filter for AuditTrail to access only audit trail records 
that pertain to the Partitionl records with a Primary key 
value in MoveRange. 

Starting with the audit record in the Audit Trail at the initial 
AuditPtr value, repeat the following steps for each 
sequential audit record that passes the Filter until the end 
of the Audit Trail is reached: 

Modify the audit record to refer to Partition2. 

If the change to the table noted in the audit record 

is already reflected in the copied records. 
Ignore the modified audit record 

Otherwise 

Perform a "redo" with respect to the modified audit 
record so as to apply the change noted in the audit 
record to Partition2. This includes updating the 
B-tree for Partition2 as needed to perform the 
change. 

I 

FIGURE 6A 
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1 

Phase 3: "Final Cleanup" 

Request and wait for lock on the OldPartition. 
As a unitary transaction: 
Read each audit record in the Audit Trail and repeat the 
following steps for each sequential audit record that 
passes the Filter until the last such audit record has 
been processed: 

Modify the audit record to refer to the Partition2. 
If the change to the table noted in the audit record 
is already reflected in the copied records. 

Ignore the audit record 
Otherwise 

Perform a "redo" with respect to the modified 
audit record so as to apply the change noted in 
the audit record to the Partition2. This includes 
updating the B-tree for Partition2 as needed to 
perform the change. 

Split B-tree for Partition 1 into two parts, one for records 
in NewRangel and one for records in MoveRange. 
Make MoveRange in Partition 1 inaccessible. 
Request and wait for lock on the Base Table. 
Update catalog entries and file labels for Partitionl and 
Partition2. 

Update catalog entries and file labels for all other 
partitions of the Base Table. 
Release Lock on BaseTable. 
Conclude transaction. 

In Background, delete all records in Partitionl 
referenced by B-tree for records in MoveRange 



,-244 



FIGURE 6B 
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Command: Createindex(DaseTabie, AiiKeyCuiurnns, 
UniquelndexFlag, Newlndex) 



Phase 1 : "Dirty Copy" 

Save position of last record in Audit Trail as AuditPtr. 

While allowing user transactions to continue accessing the specified 
BaseTable, create Newlndex record for each record in the BaseTable: 




Create Newlndex Record 251 

AltKeyl = Create_AltKey(Record, AltKeyColumns) 
PrimaryKeyl = Create_PrimaryKey(Record, PnmarytKeyCoIumns) 
If UniquelndexFlag 
Prior to storing each Newlndex record, check to see if Newlndex 
already has a record with an alternate index value of AltKeyl . 
If one is found 

Check to see if (A) the record still has index values of 
AltKeyl and PrimaryKeyl, and (B) if there is another record 
in the BaseTable having an alternate key value of AltKeyl. 
If both A and B are true 

Abort Createlndex - conflicting database records found 
Else 

Store new index record in Newlndex 

Else 

Store new index record in Newlndex 
Else /* UniquelndexFlag = False 7 
Store new index record in Newlndex 





FIGURE 7A 
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FIGURE 7B 
FIGURE 7C 

FIGURE 7 
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Phase 2: "Partial Cleanup" 

Set Filter for AuditTrail to access only audit trail records 
that pertain to the BaseTable. 

Starting with the audit record in the Audit Trail at the initial 
AuditPtr value, repeat the following steps for each 
sequential audit record that passes the Filter until the end 
of the Audit Trail is reached: 

Call Audit Fixup Procedure for Create Index ^253 

Audit Fixup Procedure for Create Index 

If the audit record denotes a change to a base table record 
that does not alter either the Primary Key or the AltKey of 
the record, or is already reflected in the Newlndex 

Ignore the audit record 
Elseif the audit record denotes deletion of a base table 
record 

Create an Audit Record denoting deletion of associated 

Newlndex record, if one exists 
Elseif the audit record denotes addition of a base table record 

Create an Audit Record denoting creation of Newlndex 

Record (using same procedure as in Phase 1) 
Else r audit record denotes alteration of a base table record 7 

Create an Audit Record denoting deletion of the 

Newlndex record, if one exists, for altered record's prior 

alternate key value 

Create an Audit Record denoting creation of a Newlndex 
Record for altered record 

Perform a "redo" of the created audit record(s) so as to apply 
changes to the index that are needed for the change to the 
BaseTable, as recorded by the audit record. Update the 
B-tree for the index as needed to perform the changes. 
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Phase 3: "Final Cleanup" 

Request and wait for lock on the Base Table. 

As a unitary transaction: 
Read each audit record in the Audit Trail, starting with 
the first record created after the end of Phase 2, and 
repeat the following steps for each sequential audit 
record that passes the Filter until the last such audit 
record has been processed: 

Call Audit Fixup Procedure for Create Index 

Create Catalog entry for Newlndex 

Update file labels for all partitions of the Base Table. 

Release Lock on BaseTable. 

Conclude transaction. 
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